Random thoughts about strong A.I.

The Singularity is Near


Representing Time (Temporal Logic)


I have decided that time should be part of the base instruction set, or innate knowledge, of the A.I. engine. I am going with a straightforward theory of time: time flows inexorably forward, and events are associated with either points or intervals on a timeline, with the familiar temporal logic of past, present and future. Theoretical physicists will be disappointed that I will skip the more exotic models for now, such as the idea that time is an emergent phenomenon arising from quantum entanglement.

Back to the problem at hand. Language is filled with verb tenses to describe time, such as the following:

I arrived in Boston.
I am arriving in Boston.
I will arrive in Boston.

Each describes the same action, but at a different period in time, and all relative to “now”. Much of time is described relative to other events, the most common of which is “now”, and rarely exactly, as with a watch. This is another area for fuzzy logic (to be described in a future post). I will be working on this over the weekend.
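
To make this concrete, here is a minimal sketch in Python of how an event might be anchored to a point or an interval on a timeline and classified relative to “now”. The class and field names are my own invention for illustration, not part of the actual engine:

from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional

class Tense(Enum):
    PAST = "past"
    PRESENT = "present"
    FUTURE = "future"

@dataclass
class TemporalEvent:
    """An event anchored to a point or an interval on the timeline."""
    description: str
    start: datetime                  # point events use start only
    end: Optional[datetime] = None   # interval events also have an end

    def tense(self, now: datetime) -> Tense:
        """Classify the event as past, present, or future relative to 'now'."""
        finish = self.end or self.start
        if finish < now:
            return Tense.PAST
        if self.start > now:
            return Tense.FUTURE
        return Tense.PRESENT

# "I am arriving in Boston." -- an interval that contains "now"
now = datetime(2014, 3, 1, 12, 0)
arriving = TemporalEvent("arrive in Boston",
                         start=datetime(2014, 3, 1, 11, 50),
                         end=datetime(2014, 3, 1, 12, 10))
print(arriving.tense(now))   # Tense.PRESENT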

An update on Scopes or Context

In a previous post, I had started to think about object references (proper nouns) and how they should work. After more thought today, it seems that not only are the scopes or contexts a set of trees that get traversed to find an object reference, but the objects themselves must have their own scopes that get added into the search order. I haven’t figured out how all these pieces fit together yet. Something to look forward to for tomorrow.

One could imagine a story that introduces a known character and, by doing so, introduces that character’s metadata into the scope for future words.

A Light at the End of the Tunnel?


For most of my professional life, I have ignored the various goings-on within the A.I. community. First, as I have previously explained, I didn’t think that computers had the requisite computational power, and second, I figured that since existing research hadn’t solved the problem, the current line of thinking might cloud the way to an innovative alternative solution.

Fast forward to today, and after what I hope was an epiphany on the Strong A.I. problem set, I have started to read the various papers of the last forty years or so. First, I am struck by how much early work has been done on such a difficult problem set. Second, sadly, many of my own ideas have already been researched in depth by lots of different groups. That’s good news, because individual researchers have sometimes spent a lifetime on very narrow but important aspects of the problem set, but it does raise the question: why hasn’t anyone put all the various research together to build Strong A.I.?

Here are my thoughts on the subject. Over the last couple of days in particular, I have been reading Speech and Language Processing by Jurafsky and Martin. I have almost finished reading the textbook and watching their videos online from Stanford. As I mentioned, I have been fortified by the approaches discussed in the book, because I have come to similar approaches independently.

I have a pretty good idea of how to build the Semantic Analysis step in the parser, generate First Order Predicate Logic, and build the resulting inference engine (a future post). But how to get to the last step? In particular, what to do after the Semantic Analysis step in the parser? My initial thought was to use “Actions as Meaning”, similar to Terry Winograd’s approach in his SHRDLU system. In his own words:

“One of the basic viewpoints underlying the model is that all language use can be thought of as a way of activating procedures within the hearer. We can think of an utterance as a program – one that indirectly causes a set of operations to be carried out within the hearer’s cognitive system.”

What does this mean? It means the knowledge representation system needs to be able to run functions (actions) and maintain state. Take, for instance, these commands given to SHRDLU:

Person: Pick up a big red block.
Computer: OK.
Person: Grasp the pyramid.
Computer: I don’t understand which pyramid you mean.
Person (changing their mind): Find a block which is taller than the one you are holding and put it into the box.
Computer: By “it”, I assume you mean the block which is taller than the one I am holding.
Computer: OK.
Person: What does the box contain?
Computer: The blue pyramid and the blue block.

Such systems contain a model of the current state of their domain. But what does that mean for Strong A.I., where the domain is limitless? Does that mean the A.I. engine must model the entire universe, from past to present to many possible futures?

Similar to the SHRDLU commands above, consider the following:

“Bob went to the store.”
and
“Bob goes to the store each week.”

In the first sentence, using “Actions as Meaning”, we would model the movement of Bob and change his location to the store. The second would set up a recurring event of Bob’s trips to the store. To understand the meaning of English, does this level of detail need to be remembered and “acted” upon in the knowledge system?
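
As a toy illustration (the class and method names here are my own invention, not SHRDLU or the actual engine), “Actions as Meaning” for these two sentences might look something like this in Python:

# Parsing "Bob went to the store" executes a move action that updates state,
# while "Bob goes to the store each week" installs a recurring event.
class World:
    def __init__(self):
        self.locations = {}        # entity -> current location
        self.recurring = []        # (entity, action, target, period)

    def move(self, entity, place):
        """'Bob went to the store' -> update Bob's location."""
        self.locations[entity] = place

    def schedule(self, entity, action, place, period):
        """'Bob goes to the store each week' -> remember a recurring event."""
        self.recurring.append((entity, action, place, period))

world = World()
world.move("Bob", "the store")                       # past-tense assertion
world.schedule("Bob", "go", "the store", "weekly")   # habitual assertion
print(world.locations["Bob"])                        # the store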

My initial thought is yes, it does. After all, that’s what we do as humans. Amazingly, we all keep these kinds of mental models around in our heads. Maybe the true genius is in creating the model in such a way that the model itself doesn’t take up the same amount of physical matter as what we are modeling. At what level do you model? If we could, would we model all of Bob’s atoms and their interactions as he moves to the store? Clearly, our brains don’t have this information, so it must not be needed to build an artificial brain. However, there are many fields of study, such as microbiology, that would appreciate this level of atomic modeling, where our brains are used to think about exactly such problems.

And maybe this is why no one has built an artificial brain yet. To do so means building a model flexible enough to model the universe: models of the infinitely small and the infinitely large, with enough common sense to abstract from one to the other.

An English Compiler


Over the years, I have written several compilers/interpreters: Forth, Pascal, Basic, HyperTalk (from HyperCard), SQL, and dBase, to name a few. As the old adage goes, “if all you have is a hammer, everything looks like a nail”, and this problem set looked like just another compiler to me, albeit one with a few extra wrinkles.

So, this post discusses the current state of the English compiler (a small sketch of the early phases follows the list):

1. Sentence determination: First you have to determine sentences from an input stream. Sadly, this is not as easy as you might think, because periods are sprinkled throughout English in things like “Dr.” and “$4.99”. I advocate that we remove all periods except at the end of sentences. In the meantime, I wrote code to determine a true end of sentence.

2. Lexical analysis breaks the English text into small pieces called tokens. Each token is a single atomic unit of the language. This phase is also called lexing or scanning, and the software doing lexical analysis is called a lexical analyzer or scanner. I’ve decided to do word-level tokenization instead of character-based scanning.

3. Preprocessing: Unlike a traditional compiler, where macro expansion, etc. occurs in this step, I decided this would be a good time to detect known character patterns such as times, dates, units, zip codes, telephone numbers, etc. At this point, the original string is still available, so it’s easier than doing it later, when multiple tokens might have been created for these known patterns.

4. Syntax analysis involves parsing the token sequence to identify the syntactic structure. This phase builds a parse tree, which replaces the linear sequence of tokens with a tree structure built according to the rules of a formal grammar which define the language’s syntax. The parse tree is often analyzed, augmented, and transformed by later phases in the compiler. It’s during this phase that we add word level tagging and determine word relationships.

5. Stemming, lemmatisation, word identification, word joining and metadata adornment are done next. In a previous post, we talked about stemming and lemmatisation, so I won’t go into them further here. During this phase, we combine proper nouns to form full names. For instance, the two tokens for my name, “Peter” and “Chapman”, would be combined into a single token, “Peter Chapman”. Also during this step, we examine each token and its corresponding word and adorn the token with additional information required in future steps. For instance, we might add metadata that a particular token is a floating point number.

This is where I am today. The next step is what I am going to work on tomorrow.

6. Semantic analysis is the phase in which the compiler adds semantic information to the parse tree and performs semantic checking. Semantic analysis usually requires a complete parse tree, meaning that this phase logically follows the parsing phase, and logically precedes the code generation phase, though it is often possible to fold multiple phases into one pass over the code in a compiler implementation.
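
To make the early phases concrete, here is a minimal Python sketch of the sentence-determination step from item 1. The abbreviation list and patterns are simplified placeholders; the real code handles many more cases:

import re

ABBREVIATIONS = {"dr.", "mr.", "mrs.", "ms.", "st.", "etc."}

def split_sentences(text):
    """Naive sentence splitter: a period ends a sentence only if it is not
    part of a known abbreviation or a number like $4.99."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        if token.endswith((".", "!", "?")):
            if token.lower() in ABBREVIATIONS:
                continue
            if re.fullmatch(r"\$?\d+\.\d+", token):
                continue
            sentences.append(" ".join(current))
            current = []
    if current:
        sentences.append(" ".join(current))
    return sentences

print(split_sentences("Dr. Smith paid $4.99 for the book. He left."))
# ['Dr. Smith paid $4.99 for the book.', 'He left.']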

Progress: Stemming and Lemmatisation

A writer is someone who writes, and a stinger is something that
stings. But fingers don’t fing, grocers don’t groce, haberdashers
don’t haberdash, hammers don’t ham, and humdingers don’t
humding.
–Richard Lederer, Crazy English

Just finished coding and testing the stemming routines.

A stemmer is a function that returns the root of a word. The de facto gold standard is the Porter Stemmer algorithm.

In many languages, words appear in several inflected forms. For example, in English, the verb ‘to walk’ may appear as ‘walk’, ‘walked’, ‘walks’, ‘walking’. The base form, ‘walk’, that one might look up in a dictionary, is called the lemma for the word. The combination of the base form with the part of speech is often called the lexeme of the word.

Lemmatisation (or lemmatization), in linguistics, is the process of grouping together the different inflected forms of a word so they can be analyzed as a single item.

I also finished adding lemmatisation today. See Wikipedia for more information.
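
My own routines aren’t shown here, but as a point of reference, the same two steps look roughly like this using NLTK’s off-the-shelf Porter stemmer and WordNet lemmatizer:

# Reference sketch only -- not my implementation.
# Requires: pip install nltk, plus the 'wordnet' corpus download.
from nltk.stem import PorterStemmer, WordNetLemmatizer

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()

for word in ["walk", "walked", "walks", "walking"]:
    # Both the stem and the verb lemma come out as "walk".
    print(word, "->", stemmer.stem(word), "/", lemmatizer.lemmatize(word, pos="v"))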

One of the things you learn about the world of natural language processing is that it has a lot of made-up words.

Innate Knowledge?


It seems that some in the field of A.I. wish to start with the nerve cell as the basic building block. However, building floating point arithmetic out of nerve cells seems to me like quite a challenge (I have written these functions in assembly language). I decided to jump ahead several million years in evolution. A nerve cell is too low level for me. That’s not to say that we will forgo some of the underlying principles, but the nerve cell doesn’t need to be our basic building block.

The first microprocessors did not support floating point instructions. Instead, floating point functions were created out of more primitive integer instructions. Somewhere in the early history of microprocessors, floating point instructions were added to the base instruction set.

This leads to a most important question: What is the base instruction set for Strong A.I.? Or, put another way, what is Strong A.I.’s innate knowledge? Should integer instructions be included? How about floating point? What other concepts? How about time? How about things it can never directly experience, such as color? Wouldn’t it be better to have a sighted person define a concept of color as part of the base instruction set, rather than trying to teach a blind program about the concept? Is the base instruction set immutable? Over time, can the A.I. engine come to its own meanings, replacing or adding to the meanings given to it by the original programmers?

And it is clear that not all learning can come from the base instruction set. Somewhere you have to bite the bullet and get to the A.I. piece. Maybe, in this sense, starting with nerve cells allows you to focus on this question without all the clutter.

I don’t have all the answers yet, but very interesting questions, indeed!

Progress: Dynamic programming and the Parser

Dynamic programming is a method for solving complex problems by breaking them down into simpler sub-problems. A similar concept is “chunking”, used in estimation: you break the problem set down into chunks for which you have enough personal experience to make a reasonable estimate.

I applied a similar concept to the parser. I wanted to take certain known words and derive their meaning very early on. For instance, US telephone numbers are in the form (xxx)-xxx-xxxx. English has lots of specific patterns that have a very high likelihood of a particular meaning. These include things like email addresses, URLs, zip codes, file locations, money, times, dates and units. Rather than having to deal with these after the parser and re-assemble them, I decided to handle them in a pre-processing step.

Hence, in the sentence “Peter needs a 10mm wrench by 10:30”, the pre-processor has already recognized “10mm” as a unit and “10:30” as a time.
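
As a rough sketch (the patterns below are simplified examples, not the full set), the pre-processor amounts to a table of regular expressions applied before tokenization:

import re

# Illustrative patterns only -- the real pre-processor covers many more cases.
PATTERNS = [
    ("PHONE", re.compile(r"\(\d{3}\)-\d{3}-\d{4}")),
    ("TIME",  re.compile(r"\b\d{1,2}:\d{2}\b")),
    ("UNIT",  re.compile(r"\b\d+(?:\.\d+)?(?:mm|cm|m|kg|lb)\b")),
]

def preprocess(text):
    """Tag known character patterns in the raw string before tokenization."""
    found = []
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            found.append((label, match.group()))
    return found

print(preprocess("Peter needs a 10mm wrench by 10:30"))
# [('TIME', '10:30'), ('UNIT', '10mm')]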

English to Syntax Trees and Object References

[Image: syntax tree generated from an example English sentence mentioning “Albert”]

The first pass at the parser is complete for English. It generates a syntax tree for a given line of English (shown above). This has started me thinking about how to resolve object references, which include pronouns. In the example shown, who is “Albert”?

The problem set is similar to a symbol table for a normal compiler. When a computer language references a variable, it follows a language-specific set of pre-defined rules. For example: first look for a local variable of the same name, then look at field names in the class, and finally look in the global scope for global variables.

However, English, and more correctly people, seem to be much more scattered, and the rules are much more difficult to figure out. I think the best approach would be a stack-based (or maybe tree-based?) concept of different scopes. As new topics are presented, a new scope is created and pushed onto the stack (or maybe added as a peer, hence a tree?). There is a concept of a current scope that points to a particular scope (most likely the top of the stack).

When a reference is made, the search proceeds in a particular order, from the most recent scope down to the bottom of the stack, which defines the global scope.
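
Here is a minimal Python sketch of the scope-stack idea; the class and method names are mine and purely illustrative:

class Scope:
    def __init__(self, name):
        self.name = name
        self.objects = {}          # reference -> object metadata

class Context:
    def __init__(self):
        self.stack = [Scope("global")]

    def push(self, name):
        """A new topic (document, web page, story) opens a new scope."""
        self.stack.append(Scope(name))

    def define(self, reference, metadata):
        self.stack[-1].objects[reference] = metadata

    def resolve(self, reference):
        """Search from the most recent scope down to the global scope."""
        for scope in reversed(self.stack):
            if reference in scope.objects:
                return scope.objects[reference]
        return None

ctx = Context()
ctx.define("Albert", {"type": "person"})        # introduced in the global scope
ctx.push("story about physicists")              # a new topic opens a new scope
print(ctx.resolve("Albert"))                    # {'type': 'person'}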

Certain actions would create clear delimitations of scope, or branches in the tree. For instance, opening a new document or web page for reading would define a whole new scope, clearly apart from the previous scope, but one that may still have vestigial traces to the previous context in case the pages or documents are related. Maybe as more and more relations between documents are found, their scopes would be brought closer together. And as time progresses, scopes would be more clearly separated, or deleted to save resources.

Interestingly enough, I found a paper that details a somewhat similar approach to my own: A Plan Recognition Model for Subdialogues in Conversations by Diane J. Litman and James F. Allen. I am referring to the sections describing the “discourse context” and “subdialogues”, rather than the discussion about the concept of plans.


The End of Nerds?


Computer languages, with the help of programmers, allow us to create programs that model almost anything, both real and imaginary. For instance, Microsoft’s Halo games render alien worlds very realistically. Our technology and tools allow computers today to do things that rival and exceed human intelligence. There is a claim that to build a Strong A.I. engine, you need the same computing power as a human brain. I am going to argue that you don’t. And if I am wrong, with Moore’s Law it doesn’t matter anyway. If a single low-end XBOX game console can give you Halo, surely Google’s computing power is more than enough.

So if we have the hardware to do Strong A.I., what is the problem? Answer: it’s those pesky programmers and their computer languages. To make a computer do a thing, you need a programmer and a computer language. Given specifications, time and money, and a good programming team, you can do almost anything on a computer. The problem, though, is that this is a very slow (and expensive) process. One reason for the slowness is that end-user specifications must be translated into source code in a computer language such as Java, C# or C++. Another reason is that the computer languages themselves were designed bottom-up, based on the computer hardware, not on the problems they would eventually solve. As computer languages have evolved and new ones have been created, each generation has moved closer to the language of the problem set. For instance, object-oriented programming introduced classes and instances to more closely model the real world. But no computer language today can use a human language such as English directly.

What if we had a new computer language called English? Like any computer language, this new language would compile English statements into code to be executed by a computer. If we had such a thing, would we still need programmers? We might still need System Analysts. These are people who take the rules of a particular problem set and compile them into well-formed specifications. Even if you have English as your programming language, you still need the specifications to be coherent (which is a skill set unto itself).

This is the approach we have taken for fact collection. We are building an English compiler. By itself this is not Strong A.I., because you still need some special sauce to get to the next level. But it is an interesting step, nonetheless, along the path to Strong A.I. Even without the “thinking component”, the ability to parse English and compile a large interrelated set of facts is commercially useful. If any of the search engines could parse the assertions or facts from web pages on the Internet, a more useful search engine could be built. In the land of A.I., this is called knowledge representation. So our next goal is taking English as input and generating code to form fact storage.
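
As a toy picture of what “fact storage” might look like (not the actual design, just the shape of the idea), imagine assertions compiled into subject-relation-object triples:

class FactStore:
    def __init__(self):
        self.facts = set()

    def assert_fact(self, subject, relation, obj):
        self.facts.add((subject, relation, obj))

    def query(self, subject=None, relation=None, obj=None):
        """Return all facts matching the given (possibly partial) pattern."""
        return [f for f in self.facts
                if (subject is None or f[0] == subject)
                and (relation is None or f[1] == relation)
                and (obj is None or f[2] == obj)]

kb = FactStore()
kb.assert_fact("Bob", "went_to", "the store")    # from "Bob went to the store."
print(kb.query(subject="Bob"))                   # [('Bob', 'went_to', 'the store')]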

I hope to post a video of this working later this week.

And back to the title of this post: clearly, if something like this could be built, then the need for so many programmers would be greatly reduced over time. For those old enough to remember, I think the same thing could happen to software engineers as happened to hardware engineers in the 1980s.

Artificial Advisors and Personal Assistants


While Ray Kurzweil’s age of spiritual machines has yet to awaken, the age of artificial advisors is already here. We have had them in our lives for some time.

As we mix a bit of cheekiness into these agents, they become our personal assistants, such as Apple/Nuance’s Siri, Microsoft’s Cortana, and Google Now. But we are frustrated by their limitations. Strong A.I. will fix these limitations.