(Editor's note: transcripts don't do talks justice.
This transcript is useful for searching and reference, but we recommend watching the video rather than reading the transcript alone!
For a reader of typical speed, reading this will take 15% less time than watching the video, but you'll miss out on body language and the speaker's slides!)
[APPLAUSE] Hi there. Thank you, Gary. Let's get started. The room looks much smaller from up on the stage, which is nice. Because yesterday I was in the audience and was like, oh my god, this is a lot of people. But there's not so many of you now. So, hello, my name is Allison Parrish. I am a poet and computer programmer, and I'm a member of the full-time faculty at New York University's Interactive Telecommunications Program. I'm in the middle of some stuff. Usually my process looks something like this, where I start with the idea, and then I might do all this research and experimentation and so forth, and then everything sort of shrinks back down into like this tiny synthesis. And usually I like to give a talk when I'm at that synthesis point.
But right now, I'm sort of like right here. And all of these ideas are sort of still in the middle of the research, still playing around. I haven't reached any conclusions yet, and I'm not even sure what the point of any of this stuff that I'm about to talk about is. In other words, I don't know what I'm talking about yet. But I think it's interesting, and it helps me to understand it better to actually present the material. And I hope it's interesting for you too. So thank you for bearing with me.
I'm going to talk about the idea of interpolation as an idea, and the history of interpolation in the arts in particular. And then I'm going to talk about some of my own work that makes use of interpolation. So my overall research project for the past couple of years has been how to make language malleable or tangible to make writing interfaces that engage the body in intuitive and non-literal ways. So basically anything other than the keyboard, right. So more particularly I'm thinking about what are the ways we can use computation and digital media to do this.
I just sort of think that writing a poem should feel like this. I've never actually used a pottery wheel, but it looks like fun. It looks like something that you would want to do. And I think that writing should feel that same way. And why shouldn't it. And in particular I want to be able to like stretch and smush language the way that you stretch and smush clay. That feels like a good interface for that. So this quote from Amiri Baraka is sort of my guiding light in my research.
It's from an essay that Amiri Baraka wrote, about how technology is an expression of culture. And he is talking specifically about the QWERTY keyboard. And he says, "A typewriter-- why should it only make use of the tips of the fingers as contact points of flowing multi-directional creativity. If I invented a word placing machine, an expression scriber, if you will, then I would have a kind of instrument into which I could step and sit or sprawl or hang, and use not only my fingers to make words express feelings, but elbows, feet, head, behind. All the sounds I wanted-- screams, grunts, taps, itches. A typewriter is corny."
This quote is proposing that we think about text composition differently. Instead of literally typing letter by letter, we should be able to use our bodies to use logics other than the literal. And I think that the work of poets and creative writers should be, partially, to build these kinds of interfaces. Also, I recently came across this article by Laurie Spiegel, who is of course a brilliant composer who works with computational methods. And it opened up a lot of avenues for me. So I want to share some of that with you.
The actual article is sort of an acerbic thing. It starts with a subtweet about how musicians in her field are more focused on technical details, than on actually making music. But she has this interesting view and says the process of creating music involves not only the ability to design patterns of sound, but a working knowledge of all the processes of transformation which can be aesthetically applied to them.
So it seems like a good idea to look at old-fashioned non-electronic music, and try to extract a basic library of transformations that have been successfully applied in the past. And she comes up with this list of 13 different transformations, and this is for musicians, but honestly, this list to me is sort of like Brian Eno's Oblique Strategies. Like it's worthwhile to go through this list regardless of your discipline. In particular, one of those transformations stood out to me. That's interpolation, which is filling in between previously established points. Inserting a smooth ramp between discretely separated values. And this struck a chord with me. That was not supposed to be a pun, but I guess it was. I did not intend that.
Filling in between previously established points. And first of all, it occurred to me when reading this that interpolation is kind of exactly what I've been doing with language for a while over the past couple of years with my experimentation and arts practice. And when I get to my own stuff in this presentation, you'll see what I mean. It also occurred to me that interpolation is something that's like almost uniquely afforded by digital media, because it's digital media. So by definition, you're always starting out with discretely separated values that need to be interpolated. Because digital media only stores discrete values.
And then I was wondering who else has been working with this, like can you conceptualize interpolation as an artistic practice. And how specifically does this apply to poetry, because you don't usually think of language as being composed of established points. So how can you apply this technique to language? So we'll have some answers to that.
So very quickly, I'm going to show some examples of what I think qualify as interpolation being used in what I'm just calling discovery. And by discovery here, what I mean is using interpolation to predict. To create something new. To find something out. If you're not familiar with the mathematical definition of interpolation, here are some illustrations that I stole from Wikipedia. If you have a set of discrete data points, interpolation gives you the answer to the question, what happens between those points. And there are various methods that you can use to connect those points that have different benefits and drawbacks. But the point is, you're missing data. Interpolation lets you guess what the data might be, based on the data that you do have.
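(Editor's sketch, not from the talk: the simplest of the methods mentioned above is linear interpolation, which connects neighboring points with straight lines. The function name and the sample data below are made up for illustration.)

```python
from bisect import bisect_right

def interpolate(points, x):
    """Piecewise-linear estimate of y at x, given (x, y) samples sorted by x."""
    xs = [px for px, _ in points]
    if not xs[0] <= x <= xs[-1]:
        raise ValueError("x lies outside the known points")
    # Index of the known point just left of x (clamped so i + 1 stays valid).
    i = min(bisect_right(xs, x) - 1, len(points) - 2)
    (x0, y0), (x1, y1) = points[i], points[i + 1]
    t = (x - x0) / (x1 - x0)  # 0.0 at the left point, 1.0 at the right point
    return y0 + t * (y1 - y0)

samples = [(0, 0.0), (1, 2.0), (2, 3.0), (4, 7.0)]  # made-up data points
print(interpolate(samples, 1.5))  # 2.5
print(interpolate(samples, 3))    # 5.0
```

Other methods (polynomial, spline) swap the straight line for a smoother curve, at the cost of sometimes overshooting the known values.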
If you're a front end developer, or an animator, or something like that, then there's a kind of interpolation that you're likely very familiar with, and that's tweening. Which is the process of figuring out what should happen between frames, like with keyframes in animation. And I was looking all over Wikimedia Commons for an example of a tweening animation for this presentation. Then I realized, wait, I already made an example of that a couple of slides up, where you have one circle that moves and becomes another circle, and then moves and becomes another circle. In Keynote, all I did was define those positions of the circle. And then Keynote interpolated between those positions. So that's a simple example of tweening being used for a rhetorical purpose in this instance.
Morphing is another common artistic technique in animation that shows a metamorphosis between two forms. So this is a segment of the face morphing sequence from Michael Jackson's Black or White video, directed by John Landis. I remember watching this when I was like 10 or something. My mind was just blown by this previously unheard of visual technique. It's one of the earliest examples of computer-aided morphing. And of course Animorphs shows this translation or this interpolation between a girl and a starfish. In these cases, the morphing, the interpolation, is being used to draw attention to the similarities and differences between the points on the extremes. But I don't think it's being used for discovery in this case. I don't think the point of that Animorphs cover is to help us imagine what is halfway between a girl and a starfish, it's just about showing that transformation.
An earlier example-- and it's the earliest example I could find-- of morphing as a kind of interpolation, is something called tabula scalata, or turning pictures. And this was common in the 16th and 17th centuries. It's sort of like lenticulars that we have today. And basically the idea of the turning picture is that you start with a corrugated surface, and you cut an illustration into strips, apply the strips on the aligned faces of the corrugation, and then you can put another illustration on the other side of the corrugation. So when you're looking at the image from one side or another, you see different images.
So this is a photograph of a well-known turning picture, or two photographs from different angles. This is supposedly a portrait of Mary, Queen of Scots on one side, and a skull on the other. And not all of these turning pictures were meant to depict a metamorphosis between two states. But the fact, in this one, that the eyes of the queen and the orbital sockets on the skull are aligned, to me implies that you were also intended to look at it head on. To see the middle state between the two. So that you can imagine this sort of phantasmagorical middle state between the queen and a skull.
There's this great article-- this great paper by Allan Shickman that is about turning pictures in Shakespeare's England, where he goes over all of these examples in literature contemporary to the period, of people talking about turning pictures in like plays and poems and stuff. And there's this one excerpt here from George Chapman's Chabot, Chabot, I don't know how to pronounce it. "As the picture wrought to optic reason that to all passers by seems as they move, now woman, now a monster, now a devil. Until you stand in the right light and view it, you cannot judge what the main form is."
And the fact that this shows that transition between now a woman, now a monster, and now a devil, implies the middle point was intended as part of the rhetoric of the piece. It was supposed to be an interpolation, and in the middle you are supposed to see that monstrous form. So it's kind of a visual interpolation being used in an artistic context. An example from the sciences, and I'm sure there are many more of these that you can think of, but I kind of think of the discovery of technetium, the element technetium, as an exercise in interpolation. And a number of other elements were discovered this way as well.
So I was a nerdy kid and I had a poster of the periodic table of elements on my wall. And it looked sort of like this on my bedroom wall when I was a kid. And all of the artificially prepared elements are in a different color than the rest of the chart. And most of these artificially prepared elements are the big radioactive ones. But there's technetium, right there, which is sort of in the middle. And it seems out of place.
So this was always weird to me because it was so high up in the chart, far away from the other artificial elements. Why? It turns out that technetium has the lowest atomic number of any element that has no stable isotopes. So it occurs on earth in only small trace quantities. It had never been actually observed in nature before. And we've since discovered that it does occur in trace quantities occasionally. But it was the first element to be produced synthetically, and that's why it's called technetium, from the Greek root for making things.
So the interesting thing about technetium, the relevant thing to this talk, is that it wasn't officially discovered until 1937. But, its existence and properties were accurately predicted in 1869 by Dmitri Mendeleev, and he called that element the hypothetical element eka-manganese, because it was one spot below manganese on his early periodic table. So this is a kind of interpolation. He was predicting properties of something based on the surrounding data points, and Mendeleev had the confidence in the principles underlying his table. So he was able to make predictions about the actual real world by filling in the missing items from the table. So, discovery by filling in a grid. That's sort of what interpolation is about.
Scatter plots. I wanted to talk about scatter plots. A scatterplot is-- I'll show an example of it in a second-- it's a kind of elementary data visualization technique. And it takes two dimensional data-- anything with two columns in a spreadsheet-- and then shows them on a Cartesian system using one value for the x-axis, and the other value for the y-axis. I don't think that scatter plots are a kind of interpolation. They don't fit into the rubric, but it helps me talk about something else.
This is a very quick scatterplot I made to demonstrate what a scatterplot is, for those of you who might not be familiar. It's a chart of all the words in the CMU Pronouncing Dictionary, which I'll get back to later. With the number of letters in the word on the y-axis, and the number of syllables in the word on the x-axis.
And you can see there's generally an upward trend, meaning that the number of letters in a word, and the number of syllables in the word-- as you would expect-- would be correlated. In case you're wondering, the outliers are supercalifragilisticexpialidocious, antidisestablishmentarianism, deinstitutionalization, and extraterritoriality. The one-letter word that has seven syllables in it is a bug in my library for using the CMU Pronouncing Dictionary. I need to push a fix for that to PyPI. The request has already been accepted. I just hate the Python packaging system. I always mess it up. I always do something wrong when I release a package.
So there's another kind of scatterplot that I think does have a relationship to interpolation. A regular scatterplot doesn't tell you anything about the values in between. It just shows you the relationship of the values. But there's another kind of scatterplot, which I'm calling a scatterplot of likenesses. This is a scatterplot, except each of the data points on the plot is represented by a likeness of the item in the data set that corresponds to that point. And this is a weird idea. I consulted with like three different research librarians at NYU on this question, and no one knew if there was a name for this particular thing. So that's why I'm proposing to call it a scatterplot of likenesses. I was also thinking of "small multiples on the manifold", which is more impressive, but I think less clear. So scatterplots of likenesses is what I'm proposing.
What's interesting to me about scatterplots of likenesses is that they allow for a kind of ad hoc visual interpolation. By which I mean, you can draw a line between any two data points to see what the intermediary forms between those two points might look like. So the earliest example I could find of this-- and I'm sure there is something earlier, I just couldn't find any-- is from Scott McCloud's Understanding Comics. And this chart describes different styles of comic art, and it has three different extremes. Visual verisimilitude, abstraction up top, and meaningfulness on this side. The individual illustrations on the chart are plotted according to their perceived correspondence with these three axes. And what's actually being plotted are the drawing styles of individual comic artists. So this is zooming in on that same chart. So you can see that the comics, the actual illustrations, are being used to illustrate those points.
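(Editor's sketch, not the speaker's library: one way to compute the two numbers plotted for each word. It assumes pronunciations in the CMU Pronouncing Dictionary's format, where each vowel phoneme ends in a stress digit, so counting those digits counts syllables. The four entries are hand-typed illustrations; the real dictionary is far larger.)

```python
# Illustrative entries written in the CMU Pronouncing Dictionary's format.
ENTRIES = {
    "cheese": "CH IY1 Z",
    "abacus": "AE1 B AH0 K AH0 S",
    "octopus": "AA1 K T AH0 P UH2 S",
    "poetry": "P OW1 AH0 T R IY0",
}

def syllable_count(pronunciation):
    # Vowel phonemes carry a stress digit (0, 1 or 2); one vowel per syllable.
    return sum(1 for phoneme in pronunciation.split() if phoneme[-1].isdigit())

# One (x, y) point per word: x = number of syllables, y = number of letters.
points = {word: (syllable_count(pron), len(word)) for word, pron in ENTRIES.items()}
print(points["octopus"])  # (3, 7)
```

Feeding these points to any plotting tool gives the scatterplot described above.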
What's interesting to me about this, is that you can kind of make your own interpolation. So you can draw a line from Charlie Brown to Batman. And then you can say the comic whose style is in between Charlie Brown and Batman, is Jim Valentino's Normal Man. That's the one that I circled there. So a scatterplot with likenesses is a scatterplot that you can explore.
And I see this visualization technique a lot in machine learning applications, where high dimensional data is reduced to two or three dimensions using a manifold learning technique like t-SNE, t-distributed stochastic neighbor embedding, or other dimensional reduction techniques, and then there's a scatterplot with likenesses that is drawn in the resulting lower dimensional space. And this image probably looks familiar if you follow machine learning, artificial intelligence stuff at all. These kinds of plots are extremely common.
This is a visualization of the underlying vectors from a convolutional neural network trained on ImageNet images. It's made by Andrej Karpathy. And so I have a little zoom in to this space here. Each one of these images is being plotted in the x and y position that corresponds to a dimensional reduction of the underlying vector. And so you can see as you zoom in, here you end up in dresser drawer town, or something like that. But you can see that images with similar content are being drawn next to each other.
And again, this isn't exactly interpolation, but it's a kind of data visualization that encourages exploration and discovery, in how the items in the chart are similar or dissimilar to each other based on their proximity. So it affords that kind of interpolation-esque exploration and conclusions. Continuing in the machine learning vein, there's something called a variational autoencoder, which is a kind of neural network that's sort of specifically designed to make interesting interpolation possible. Here's a very, very high level understanding of how a variational autoencoder works. And I'm going to talk about autoencoders for sequences, because that's what I'm most interested in. Because language is usually represented as a sequence in machine learning models.
So the job of the variational autoencoder is to automatically learn a fixed length vector-- that thing in the middle-- that can represent anything in the input that you send to it, and then generate output from that same input. So it's basically like learning how to do data compression on the fly, just by being shown a whole bunch of examples.
And trying to reproduce those examples after squeezing them into a smaller vector representation. Now the interesting thing about a variational autoencoder is that you can generate new output simply by plugging a random vector value into the decoder, and then you get a random value back out of it. Like a generated value back out from it. Or if you have an existing item, you can generate something similar to it by feeding an image in, or sequence or whatever, and then varying that vector a little bit, and then get a new image out.
So an example of that, and this doesn't show up great on the screen-- I'm sorry about that. This is David Ha's sketch-rnn, which he actually released the source code for, and I used it for a project, and I'll show that in a second. This is a great example of a variational autoencoder being used for interesting artistic purposes.
So he trained this variational autoencoder on Google's Quick, Draw! data set, which is this big database of vector drawings that are produced by people who are playing a game made by Google. And this autoencoder learned latent vectors for all of these drawings-- cats, crabs, firetrucks, gardens. And because of the way the variational autoencoder works, you can create drawings that are interpolations between two other drawings, by sampling the latent space in between those values.
So this illustration is showing-- in the four corners, you have a pig, and a bunny, and a crab, and a human face. And it's showing all of the intermediate stages between these four corners. So this is again interpolation in order to create some kind of new interesting artifact. An earlier use of variational autoencoders that used a very similar architecture was Samuel Bowman's Generating Sentences from a Continuous Space, where they fed a whole bunch of sentences into the variational autoencoder, which were all encoded as fixed length vectors. And then you could do that same thing, interpolating between sentences, using that variational autoencoder.
So this is, again, interpolation as a way of discovery. It tells you that the sentence halfway between "I went to the store to buy some groceries" and "horses are my favorite animal" is "horses are to buy any groceries." Interpolation for discovery and creation. These are those weird horse chairs. This is an extremely, absolutely beautiful paper that I discovered while researching this talk, that implements a system for interpolating between 3D models of animals and other things.
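(Editor's sketch, not sketch-rnn's actual code: the four-corner grid described above can be produced by bilinear interpolation between four latent vectors. The 2-dimensional vectors and the names `lerp` and `latent_grid` are made up for illustration; in a real model each grid entry would then be passed to the trained decoder to produce a drawing.)

```python
def lerp(a, b, t):
    """Blend two equal-length vectors: t=0 gives a, t=1 gives b."""
    return [x + t * (y - x) for x, y in zip(a, b)]

def latent_grid(top_left, top_right, bottom_left, bottom_right, size):
    """Return a size x size grid of vectors blended between four corners (size >= 2)."""
    grid = []
    for row in range(size):
        v = row / (size - 1)
        left = lerp(top_left, bottom_left, v)     # point on the left edge
        right = lerp(top_right, bottom_right, v)  # point on the right edge
        grid.append([lerp(left, right, col / (size - 1)) for col in range(size)])
    return grid

# Hypothetical 2-dimensional latent codes; a real model uses many more dimensions.
pig, bunny, crab, face = [0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]
grid = latent_grid(pig, bunny, crab, face, size=5)
print(grid[2][2])  # centre of the grid: [0.5, 0.5]
```

Straight-line blending is the simplest choice; some implementations interpolate along an arc in the latent space instead.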
So this is like what if you interpolated between a horse and a chair. The key quote from that paper is, to our knowledge, no prior work in computer graphics has proposed or developed a computational approach to designing zoomorphic shapes. But they find them to be very aesthetically pleasing. You should look through this paper, because it is just amazing.
One more example of interpolation is-- and this again is focused on language. I grew up in the age of textmode, and so I'm trying to teach myself to care about fonts, which I've never cared about before. So I'm thinking a lot lately about letter forms, and how actual words appear on the screen, and how letters are composed. This is a really beautiful project by Dan Weber, who graciously gave me permission to show this in this presentation. It's called [INAUDIBLE]. In which these words that are written in a cursive font, are sort of morphing into each other over the course of the animation. I love this piece, it has such a musical effect like you can almost hear it sing when you're looking at it.
And obviously this is the use of interpolation. It's like with keyframes-- whoops, I'm going to show that again. There are obviously the keyframes, the words that are being morphed into the subsequent word in the phrase. But what's most interesting to me about that is these in-between states. When that morphing process is in the middle, you see these words that are sort of in between one word and another. In between a straight line. And the interesting thing about these is that they produce a sort of synaesthetic effect, where you get almost the meaning of both words at the same time, if you squint at it a little bit. So in this strange state of superposition, it's a very interesting poetic effect, discovering new words between these existing words.
So that's something I've been thinking about a lot lately, too. So that was other people's stuff. I want to share some of my own experiments in this area. And some of the stuff I've talked about before, you might have seen me talk about some of these things. I'm going to try to move through the technical stuff pretty quickly. So the first question is, if you want to do interpolation with language, you first have to meet the precondition of what interpolation is. You need some fixed points to interpolate between. So you need to take-- you need to find some way of taking a unit of language and representing it as a sequence of numbers, or a vector. And it just so happens that word vectors are a thing. They're used very commonly in machine learning, artificial intelligence, natural language processing.
They're a very, very handy thing to know about, and to use, and they've been like the basis of my own poetic practice for the past couple of years. So very, very quick explanation of how word vectors work. There's an underlying assumption in vector representations of the meaning of words, which is the distributional hypothesis, which states that linguistic items with similar distributions have similar meanings. So in other words, a word is characterized by the company it keeps. And according to the distributional hypothesis, a word's meaning is just a big list of all the contexts that it occurs in. And two words are closer in meaning if their contexts are similar. So you can imagine operationalizing the distributional hypothesis by, for example, using a spreadsheet like this.
Start with a sentence. "It was the best of times, it was the worst of times," and that will be our corpus. And then make a spreadsheet with one column for every possible, let's say, one-word context that a word can occur in. And then for every row, we have all of the unique tokens in the text. And then in each cell, we have the count of the number of times that word occurred in that context. The numbers in each row then constitute the vector for each of those words. So for example, the vector for "of" is 000100010. And the vectors for "best" and "worst" are both 0001000000. These are actually identical, right? And according to the distributional hypothesis, two words that share, or have similar contexts, are also similar in meaning.
We usually think of best and worst as being antonyms, but they're also semantically related. They're both about having a strong emotional response to something. And that fact is captured by the distributional hypothesis. So if you did that process, or something similar to it, with a really, really large corpus like all of Wikipedia or the Common Crawl, or whatever, and then made that spreadsheet, it would have millions of columns and millions of rows. But you can compress that down using dimensional reduction techniques, and end up with a vector that looks like this. This is, according to the pre-trained vectors released by Stanford, the GloVe vectors, the number that represents the meaning of the word cheese. So this is what computers think cheese means.
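(Editor's sketch of the spreadsheet idea, under one assumption that may differ from the slide: a word's "context" here is the pair of words immediately before and after it, and the first and last words of the corpus are skipped because they lack one neighbor. The column order, and so the exact digits, will not necessarily match the slide.)

```python
from collections import Counter

corpus = "it was the best of times it was the worst of times".split()

# Columns of the "spreadsheet": every (previous word, next word) pair in the corpus.
contexts = sorted({(corpus[i - 1], corpus[i + 1]) for i in range(1, len(corpus) - 1)})

def context_vector(word):
    """Row of the spreadsheet: how often `word` occurs in each context."""
    counts = Counter(
        (corpus[i - 1], corpus[i + 1])
        for i in range(1, len(corpus) - 1)
        if corpus[i] == word
    )
    return [counts[c] for c in contexts]

print(context_vector("best") == context_vector("worst"))  # True
print(context_vector("best") == context_vector("of"))     # False
```

Both "best" and "worst" only ever appear between "the" and "of", which is why their rows come out identical.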
Maybe as like a more intuitive version of that, I made this tool for plotting word vectors in sort of a radial graph. This is using a 50 dimensional version of the word vectors. And each value from the vector is plotted in a circle. And you can see that all of the vectors for the numerals here, sort of have the same shape. You kind of squint to see the similarities. But the numerals have similar shapes, and then the months have similar shapes. But it's showing you that-- it's giving you this visual indication that those vectors are similar, if the words have similar contexts or are similar in meaning. And you can play around with this at that URL. So word vectors can be the points for our words, and you can think of a text as being a sequence of points, where the points are word vectors. And then with that sequence, you can do all of the same things with the text, that you can do with other kinds of sequences of vectors, like audio files and image files and so forth.
You can blur them, resample them, blend between them, and I've been playing around with this concept a lot over the past couple of years. So I probably won't read all of this, but this is a piece that I made called Frankenstein Genesis, which consists of blended word vectors. Where I'm just taking the word vector that is halfway in between each corresponding word from the text. So the word vector that's halfway between "in" and "hateful", and then the word vector that's between "the" and "day", and then the word vector that's between "beginning" and "when", and so forth throughout the entire text.
Finding the word that's closest to that in my corpus, and then creating a new text with those blended together. And when you put that all together, the average of these two texts-- the word-wise semantic average is, it starts with "unlikable the beginning disbelievers received opsec haven't exclaimed the agony accursed wrongdoer greedily was you form a monster which ageless, which painlessly the smote of this deep bewilderment body sought for spirit trifle." Et cetera, et cetera. So you lose the syntax of it using this technique, but you end up with this text, that weirdly actually feels like it is halfway between in meaning these two other texts.
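(Editor's sketch of the word-wise averaging just described, not the code used for the piece. The tiny 2-dimensional vocabulary is made up, and Euclidean distance is an assumption; pre-trained vectors such as GloVe have many more dimensions, and cosine similarity is another common choice.)

```python
import math

# Made-up 2-dimensional "word vectors" for illustration only.
VECTORS = {
    "day": [0.0, 0.0], "night": [4.0, 0.0], "dusk": [2.0, 0.5],
    "sun": [0.0, 4.0], "moon": [4.0, 4.0], "sky": [2.0, 3.5],
}

def midpoint(a, b):
    """The vector halfway between a and b."""
    return [(x + y) / 2 for x, y in zip(a, b)]

def nearest_word(vector):
    """The vocabulary word whose vector lies closest to the given vector."""
    return min(VECTORS, key=lambda word: math.dist(VECTORS[word], vector))

def blend(text_a, text_b):
    """Word-wise average of two texts: pair up words, average, look up nearest."""
    return [nearest_word(midpoint(VECTORS[a], VECTORS[b])) for a, b in zip(text_a, text_b)]

print(blend(["day", "sun"], ["night", "moon"]))  # ['dusk', 'sky']
```

Because `zip` pairs words by position, the blended text is only as long as the shorter of the two inputs.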
Another thing that I've been experimenting with is phonetic similarity of vectors. So the word vectors that I was just talking about have to do with the meaning of words. I was interested as a poet in the sound of words as well. So I wanted to have a system of vectors that would allow me to make this particular-- have this piece of knowledge available to me computationally: the knowledge that the words "octopus" and "apocalypse" are similar in sound, and the words "inky" and "kinky" are similar in sound, but those two pairs of words aren't similar to each other. So I use this thing called the CMU Pronouncing Dictionary, which is this big database of pronunciations of English words. Because as you know, if you speak English, the way that words are spelled has nothing to do with the ways that they sound. This dictionary gives you a list of the way that each word sounds.
And then I came up with a system for breaking down those phonemes, the sounds in those words, into their constituent features, and then found a way to combine them that allowed me to basically make a big feature vector for all of the words in that text. And then, again using a dimensional reduction technique, you can take a word, the sound, the pronunciation, of a word, and squish it down into a vector. So this is a 50 dimensional vector that represents the sound of the word abacus.
And what's interesting about this is that once you have these vectors, all kinds of other interesting properties fall out. So for example, you can take the sounds of two words, add them together, divide by 2, find the word whose sound is closest to that resulting vector, and then you get the average of those two sounds. So an interpolation between the sounds of two words. So, for example, halfway between "paper" and "plastic" is "peptic". Halfway in sound between "kitten" and "puppy" is "committee". Halfway between "birthday" and "anniversary" is "perversity". Halfway between "artificial" and "intelligence" is "ostentatious".
Another project that I did with these vectors, this is a book called Articulations that came out on Counterpath. It's a book of poetry, came out earlier this year. And what I did is I took all of the poems in Project Gutenberg, and assigned a sound vector, a phonetic sound vector, to each one of the lines in that, and then just used a random walk to move from one line of poetry to the line of poetry that's most similar to that. And then move to a line of poetry that's most similar to that, and so forth. So I want to read this really quick, this will only take a minute. This is an example of doing a random walk through the phonetic vector space of all of the poems in Project Gutenberg.
It goes, "Sweet hour of prayer, sweet hour of prayer, it was the hour of prayers, in the hour of parting, hour of parting, hour of meeting, hour of parting this, with power avenging his towering wings, his power enhancing his power, his power. Thus the blithe powers about the flowers, chirp about the flowers, the power of butterfly must be with. A purple flower might be the purple flowers, that bore the pedals of purple flowers, Or the purple aster flowered here, the purple aster of the purple aster, there lives a purpose stern. A sterner purpose fills turns up so pert and funny of motor trucks, and vans, and after kissed a stone. An ode after Easter and iron laughter stirred a wanderer turn, a wanderer a return, A wanderer a stay, a wanderer a near, been a wanderer. I wander away, and then I wander away, and then shall we wander away. And then we would wander away, away oh why, and for what are we waiting. A why and for what are we waiting, why then, and for what are we waiting."
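(Editor's sketch of the kind of walk behind that reading, not the book's actual code. It is a simplified, deterministic version: from the current line it always hops to the closest line not yet visited, whereas the talk describes a random walk. The 2-dimensional "sound vectors" are made up for illustration.)

```python
import math

def walk(vectors, start, steps):
    """Greedy walk: repeatedly hop to the nearest line that has not been visited."""
    path, visited = [start], {start}
    while len(path) <= steps and len(visited) < len(vectors):
        current = vectors[path[-1]]
        nxt = min(
            (line for line in vectors if line not in visited),
            key=lambda line: math.dist(vectors[line], current),
        )
        path.append(nxt)
        visited.add(nxt)
    return path

# Made-up 2-dimensional sound vectors for four lines of verse.
LINES = {
    "hour of prayer": [0.0, 0.0],
    "hour of parting": [1.0, 0.0],
    "power avenging": [2.5, 0.0],
    "purple flowers": [5.0, 0.0],
}
print(walk(LINES, "hour of prayer", steps=3))
# ['hour of prayer', 'hour of parting', 'power avenging', 'purple flowers']
```

Tracking visited lines keeps the walk from bouncing back and forth between two lines that happen to be each other's nearest neighbor.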
A handful of other experiments-- I know I'm a little bit over time, but I want to share this stuff. This one didn't turn out very well, but it took four days to train the neural network, and I need to feel good about having done that. So I'm going to show it to you. So I talked about David Ha's sketch-rnn project, where what he did is he took all of the quick draw sketch drawings, and then fed them into a variational autoencoder. And what you can do is, you can put in just any sketch that you want to, and then get out a sketch that looks similar to that, that's generated by the neural network. So my thought was, what if you could do this with words.
So there's this thing called the Hershey fonts, which were developed by the United States Naval Weapons Laboratory. Weirdly, I find myself in my work always reusing artifacts created by the US military in the 60s and 70s. But this is basically a sequence, a series of fonts that just use vector data. And these are great for machine learning purposes, because it's not like a regular TrueType font, where you're getting the outline, so you have to deal with cut out shapes and stuff like that. With these, it's just straight lines, which makes it very handy for training a neural network.
So what I did is, I made a big list of several tens of thousands of words written in Hershey fonts, with the idea being I could feed them into the sketch-rnn network, and then get out words that were like the original words. It's maybe a little bit weird, and then I could [INAUDIBLE] between words the same way that David Ha did with this sketch-rnn thing. And this is what happened. You feed in essays, and you get back-- what is that [INAUDIBLE]. So it didn't work out great. I was hoping for this nice uniform latent vector space, but instead the autoencoder really couldn't reconstruct any of its inputs, and I need to debug this a little bit.
Here's some more examples of the network trying to reconstruct some of its inputs. So "paperwork" becomes "print scar limb", "Arbor" becomes "glurumb", "confusing" becomes "sunbis", "prog" becomes "gon", "dosing" is "darorons", and "replacements" is "notsimplitoroncor". So not great. [INAUDIBLE] did not learn the thing that I wanted it to learn. But, so this is the world premiere of what I'm calling the [INAUDIBLE] reconstructions. And this is a 1,000-word interpolation between two randomly sampled vectors from the latent vector space of that autoencoder. And you can, actually over the course of this piece, see some structure. And you are asking, did I use the office center of a hotel for the first time to print out a prop for a talk, and yes I did. It was very nice.
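An interpolation between two latent vectors, like the 1,000-step one mentioned above, can be sketched in a few lines. This is a minimal sketch assuming plain linear interpolation; the talk doesn't say which interpolation scheme was used, and the function name and toy vectors are hypothetical. In the piece itself, each intermediate vector would then be handed to the autoencoder's decoder to draw a word, which is not shown here.

```python
def interpolate(start, end, steps):
    """Return `steps` evenly spaced vectors from `start` to `end`, inclusive."""
    if steps < 2:
        raise ValueError("need at least two steps to include both endpoints")
    return [
        [a + (b - a) * i / (steps - 1) for a, b in zip(start, end)]
        for i in range(steps)
    ]

# Toy two-dimensional "latent vectors"; a real latent space would have
# many more dimensions and the endpoints would be randomly sampled.
for vector in interpolate([0.0, 10.0], [1.0, 0.0], 5):
    print(vector)
```

Calling `interpolate(a, b, 1000)` on two sampled vectors would give the sequence of points to decode for a piece of that length.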
So here's, very quickly, an excerpt from this work. Pecan shick. Punchling cons mcclungs cons pawn hitch alongs. Nor schvez monchlons prawn lunch klon klon. Pence konar cons mcceran donk. Prence schoen klear posting ons, and clare sean ron turk, milan lurk, mkeons stewers, predar nar lons, gainst cons, ken gen currenchgan, monchers goshington, monch clors, and ken sherorar. Pon shen shall gone ston shans gon blont cons. And it goes on like that for a while.
So it learned basically that every word in English has the syllables "onch con" in it for some reason. I need to debug this. So getting back to my original goal here, I wanted to be able to stretch and smush text. So I've been working on what I think is the simplest possible implementation of this. And basically, conceptually, what I want to be able to do is interpolate between short and long sentences that mean the same thing. And I'm doing this with a combination of symbolic, like rule-oriented, and statistical approaches. I'm sure that there's an end-to-end machine learning way of doing this, where you find a corpus of phrases with their paraphrases. And you could use that to train an RNN or something, but I wanted to do something where I had a bit more fine-grained control.
So with word vectors, it's easy enough to just resample the sequence to be of a different length, but in the process of that, you would lose some structure. My theory about stretching out a sentence is that when you stretch out a sentence, more structure should be introduced. Zooming into a sentence should be sort of like zooming into a fractal, where the longer the sentence goes, like there's more structure in it to keep up with that. So this is sort of what I was trying to do with my implementation of this. Basically, here's a visualization of something called a dependency parse tree, which basically assigns a part of speech and a dependency relation to each word in a sentence. And each word has a head, and it has children. You can sort of separate out the heads and children in order to have discrete units of language, which are usually called constituents in linguistics.
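To make the idea of separating heads and children into constituents a bit more concrete, here is a small sketch. It assumes the parse is already available as a list of head indices, one per word; the sentence and its heads below are hand-annotated for illustration rather than produced by a real parser, and the function name is hypothetical. A constituent is taken to be a word plus everything that descends from it in the tree, read off in sentence order.

```python
def constituents(tokens, heads):
    """Return one constituent per token: the token plus all of its descendants.

    heads[i] is the index of token i's head, or None for the root.
    """
    children = {i: [] for i in range(len(tokens))}
    for i, head in enumerate(heads):
        if head is not None:
            children[head].append(i)

    def subtree(i):
        # Collect this token and everything below it in the tree.
        indices = [i]
        for child in children[i]:
            indices.extend(subtree(child))
        return indices

    return [
        " ".join(tokens[j] for j in sorted(subtree(i)))
        for i in range(len(tokens))
    ]

# Hand-annotated toy parse (an assumption, not parser output).
tokens = ["the", "quick", "fox", "jumps", "over", "lazy", "dogs"]
heads = [2, 2, 3, None, 3, 6, 4]

for phrase in constituents(tokens, heads):
    print(phrase)
```

The root word's constituent is the whole sentence, and words with no children are constituents on their own, which gives units at every level of the tree.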
So what I did-- and here's a prototype version of the tool that I made-- is I parsed the grammatical structure of sentences, I isolated the grammatical constituents from those sentences at every level of the tree, and then made a big database with those constituents, along with a semantic vector for each of the constituents, made by averaging the vectors of each word in the constituent together. So then when I want to make a sentence longer, I can just replace each constituent with a semantically similar constituent that's longer, that has more characters.
When I want to make it shorter, I can replace it with a semantically similar constituent that has fewer characters. And you can recursively apply this to basically zoom in and out on a sentence, to stretch it and squish it together. And the database that I made for this tool was again from the Project Gutenberg corpus of poetry, which is why it has this sort of like self-consciously purple, and wordy, and florid phrasing, but I kind of like the effect of it. I think that's the end of that. "The mountain while she keeps the sea in the faith of doing howling and morally repulsive."
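The swap described above, replacing a constituent with a semantically similar one that has more or fewer characters, might look something like this in miniature. This is a sketch under assumptions: the word vectors are tiny made-up values, cosine similarity is an assumed measure, and the function names are hypothetical. It handles a single constituent only and leaves out the recursive application over the parse tree.

```python
import math

def average_vector(phrase, word_vectors):
    """Average the vectors of the words in a phrase, skipping unknown words."""
    vectors = [word_vectors[w] for w in phrase.split() if w in word_vectors]
    if not vectors:
        raise ValueError(f"no known words in {phrase!r}")
    dims = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dims)]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def resize(phrase, database, word_vectors, longer=True):
    """Swap `phrase` for the most similar database phrase with more
    (or fewer) characters; return it unchanged if there is no candidate."""
    if longer:
        candidates = [p for p in database if len(p) > len(phrase)]
    else:
        candidates = [p for p in database if len(p) < len(phrase)]
    if not candidates:
        return phrase
    target = average_vector(phrase, word_vectors)
    return max(
        candidates,
        key=lambda p: cosine_similarity(target, average_vector(p, word_vectors)),
    )

# Made-up two-dimensional word vectors, for illustration only.
toy_vectors = {
    "sea": [1.0, 0.0], "ocean": [0.9, 0.1], "tides": [0.8, 0.2],
    "hill": [0.0, 1.0], "high": [0.2, 0.8], "mountain": [0.1, 0.9],
}
toy_database = ["sea", "ocean tides", "hill", "high mountain"]

print(resize("sea", toy_database, toy_vectors, longer=True))
print(resize("high mountain", toy_database, toy_vectors, longer=False))
```

Applying `resize` to each constituent of a sentence, and then again to the constituents of the result, would be one way to get the zooming in and out effect.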
So I want to finish this presentation by reading a short poem that was produced by the code underlying this tool. The seed phrase for this was "colorless green ideas sleep furiously", which is a sentence that Noam Chomsky used to refute the idea that statistical methods could capture the richness of our intuition about the grammaticality of sentences. So I'm going to read from the longest version to the shortest version.
"Seems sturdy heroes battle themed King, could the cursed one thus procure at all. Such hearty heroes, such hall fangs here where the free frank waters run. The high priests of the beautiful where troubles but rarely come. Ocean tides with your arms where many waters sing. Those rough half brothers strive to soothe her clear strong torrents. Often do I wait. Those rough half brothers sometimes feel colorless green ideas sleep furiously. Sweet angel voices come trooping. Their sad eyes again I see. Many waters jog along her paths to give her feet. Marvel billows shoot things wait."
Here's some stuff that didn't make it into the talk. I'm going to post the slides in a second. And here's my information. So thank you very much.