Seven Languages In Seven Weeks: Prolog  Day 2
Over the weekend, I finished day 2 of Prolog from my Seven Languages in Seven Weeks book. Unfortunately, Prolog continues to confound me. It's actually a little disconcerting that I find this language so difficult to wrap my head around. I would have hoped that after so many years of computer programming, problem solving would have become easier regardless of context. But, what this language has taught me is that so much of my problem solving skill is directly related to the familiar context of the problem and seemingly not at all with my ability to reason. This saddens me.
I could only solve the first two problems. The third problem  array sorting  I could not solve. I hammered away at it for about 2 hours before I just gave up. Finally, I had to look up the Merge Sort algorithm for Prolog and then copied it over into my own terms. In all fairness, sorting in any language has never been something I was particularly good at; but, the extreme difficulty that I had with sorting in Prolog was, well, like I said, disconcerting.
Before I get into the homework problems, there were a few selfstudy items that I wanted to discuss. As part of our selfstudy assignment, we had to look up Prolog versions of the Fibonacci sequence and the calculation of factorial numbers. However, in an effort to really dig my teeth into this language, I wanted to see if I could solve these problems without resorting to Google.
Fibonacci Series in Prolog.
 % Define the base values for first two indicies.
 fib( 1, 1 ).
 fib( 2, 1 ).

 fib( Index, Value ) :
 % Make sure that we in a rule where the base cases can't
 % perform the predefined values.
 (Index > 2),

 % The value at this index is going to be based on the values
 % at the previous two indices. Create some local values for
 % the previous two indices. We can't seem to perform math as
 % part of our rules... but we can "let" a value be a math
 % operation on another value.
 IndexSub1 is (Index  1),
 IndexSub2 is (Index  2),

 % Prove that we can get some value for the N1 and N2 index
 % values.
 fib( IndexSub1, ValueSub1),
 fib( IndexSub2, ValueSub2),

 % Use the proof of the previous two rules to unify the Value
 % at this index.
 Value is (ValueSub1 + ValueSub2)
 .
As I've gotten deeper into Prolog, one of the things that I am seeing is that rules (as opposed to facts) will often needs subgoals whose only purpose is to disprove the rule. For example, in the code above, the subgoal:
 (Index > 2)
... is only there to disprove the rule for indices less than 3. When the rule is disproved, it allows the engine to fall back on the two previously defined Fibonacci facts in the series:
 fib( 1, 1 ).
 fib( 2, 1 ).
This seems to be somewhat of a reoccurring theme. The Prolog logic engine doesn't seem to rely on facts before rules; as such, your rules tend to need a way to disprove themselves when a fact would be the more appropriate answer.
Once I loaded the Fibonacci source code, I was able to query it:
 ? fib( 1, Value ).
Value = 1 ? a
no
 ? fib( 5, Value ).
Value = 5 ? a
no
 ? fib( 10, Value ).
Value = 55 ? a
no
Factorials in Prolog.
 % Define our base facts about factorials.
 % NOTE: We don't actually need this many (any more than the
 % first one; but, I wanted to start off with some rules so that
 % I could better think through the problem.
 factorial( 0, 1 ).
 factorial( 1, 1 ).
 factorial( 2, 2 ).
 factorial( 3, 6 ).

 % Define the rule for our factorials.
 factorial( N, Value ) :

 % Make sure the logic engine will fallback on our facts
 % whenever the N is greater than 3.
 (N > 3),

 % Get the previous index.
 NSub1 is (N  1),

 % Unify the previous factorial with the ValueSub1 variable.
 % Since factorial is based on *this* number multipled by the
 % product of all previous numbers, once we have the previous
 % factorial, getting *this* one should be easy.
 factorial( NSub1, ValueSub1 ),

 % Unify the Value as the previous product multiplied by the
 % current number.
 Value is (N * ValueSub1)
 .
Again, you can see that the first subgoal of our rule is simply to disprove the rule so that the logic engine can fallback to using the previously defined facts. In the end, the factorial calculations work in very much the same way as the Fibonacci series  getting a previous value and using it in conjunction with the current value.
Once I had the factorial source code compiled, I was able to query it.
 ? factorial( 1, What ).
What = 1 ? ;
no
 ? factorial( 5, What ).
What = 120 ? ;
no
 ? factorial( 10, What ).
What = 3628800 ? ;
no
I was happy to be able to solve the selfstudy problems without having to look them up. The joy of this accomplishment, however, was short lived as the homework problems (#3 specifically) proved to be smarter than I.
HW1: Reverse the elements of a list.
 % Reverse the elements of a list.


 % First, we want to define some based cases for our list reverse.
 reverseList( [], [] ).
 reverseList( [A], [A] ).
 reverseList( [A,B], [B,A] ).

 % Now, we want to define the extended list revering rules. This
 % will treat the list as a head/tail composition that can be
 % recursively proved by reversing the tail.
 reverseList( List, ReversedList ) :

 % First, we want to unify our List into its Head and Tail
 % composition. NOTE: We could have done this as part of our
 % rule signature; however, I think this explicit deconstruction
 % reads more easily for traditional programmers.
 (List = [HeadTail]),

 % First, we want to unify ReversedTail as the reversed version
 % of the Tail portion of the current list.
 reverseList( Tail, ReversedTail ),

 % Now, append the Head value (the first item in the current
 % list) to the reversed tail.
 append( ReversedTail, [Head], ReversedList )
 .
In Prolog, you can split a list into two parts  the head, which is the first element, and the tail, which is the rest of the elements. Now remember, we're not actually "setting variables" here  we're unifying logical statements. As such, splitting a list into its head and tail will only be "proven" if the list has at least one item to define the head. If the list is empty, this logical statement will fail, thereby disproving the rule.
Once we have this head/tail concept, reversing a list consists of splitting the list, reversing the tail, and then appending the head to the reversed tail. After compiling the source code, I was able to query it:
 ? reverseList( [], List ).
List = [] ? ;
no
 ? reverseList( [1,2,3,4,5], List ).
List = [5,4,3,2,1] ? ;
List = [5,4,3,2,1] ? ;
List = [5,4,3,2,1] ? ;
no
Notice that a list of actual length resulting in more than one possible outcome (all of which were the same). I am not exactly sure why this is happening. What I have noticed, however, is that if I comment out the 2nd and 3rd rules:
 % First, we want to define some based cases for our list reverse.
 reverseList( [], [] ).
 % reverseList( [A], [A] ).
 % reverseList( [A,B], [B,A] ).
... then only one solution is returned.
This is where Prolog starts (er, um, continues) to confuse me. It's like it was able to use a variety of rules and facts to solve the problem, and then returned the result of all solutions to that problem. It seems to do so without actually checking for the uniqueness of the given solutions.
Going back to one of our previous examples, remember how I said that we sometimes use a subgoal for the express purpose of disproving the rule? Well, the same logic can be applied to this situation. We already have facts defined for lists of length 2 or shorter; as such, we can add a subgoal to our reverseList() rule to disprove it for any lists of length 2 or shorter (thereby forcing the logic engine to rely on the facts).
 % Reverse the elements of a list.


 % First, we want to define some based cases for our list reverse.
 reverseList( [], [] ).
 reverseList( [A], [A] ).
 reverseList( [A,B], [B,A] ).

 % Now, we want to define the extended list revering rules. This
 % will treat the list as a head/tail composition that can be
 % recursively proved by reversing the tail.
 reverseList( List, ReversedList ) :

 % Split the list into the first three items. This is only to
 % disprove this rules for any lists less than three values.
 (List = [A,B,CDisprovenTail]),

 % First, we want to unify our List into its Head and Tail
 % composition. NOTE: We could have done this as part of our
 % rule signature; however, I think this explicit deconstruction
 % reads more easily for traditional programmers.
 (List = [HeadTail]),

 % First, we want to unify ReversedTail as the reversed version
 % of the Tail portion of the current list.
 reverseList( Tail, ReversedTail ),

 % Now, append the Head value (the first item in the current
 % list) to the reversed tail.
 append( ReversedTail, [Head], ReversedList )
 .
Notice that we have put all of our facts back; but, this time, the first subgoal of our rule splits the list into its first three items and the remaining tail. If the list has less than three items, this statement will not be able to unify and will therefore disprove the rule.
With this subgoal in place, querying with a list of any length now works properly:
 ? reverseList( [1,2,3,4,5], List ).
List = [5,4,3,2,1] ? a
no
If we had simply removed the last 2 of our 3 facts, we wouldn't need to jump through these hoops. That's because splitting the list into its head and tail would have automatically disproven the rule. We can get the same solution by using just an empty list fact and our original rule.
HW2: Find the smallest element of a list.
 % Find the smallest element of a list.


 % Define the base knowledge for getting the smallest value
 % in a list.
 smallestListItem( [A], A ).

 % Now, extend our smallest list item rule to include lists that
 % are longer than one item.
 smallestListItem( List, SmallestValue ) :

 % Unify our list value into it's Head and Tail.
 (List = [HeadTail]),

 % Get the smallest item in the tail.
 smallestListItem( Tail, SmallestTailValue ),

 % Let the smallest item be the smallest of the Head values
 % and the smallest tail value.
 SmallestValue is min( Head, SmallestTailValue )
 .
Like the previous problem, getting the smallest value in a list depends on using the head value in the context of the tail. In this problem, we are comparing the head value to the smallest value of the tail, which is unified recursively.
With this source code compiled, I was able to query it:
 ? smallestListItem( [], What ).
no
 ? smallestListItem( [5], What ).
What = 5 ? a
no
 ? smallestListItem( [5,1,10,2], What ).
What = 1 ? a
no
HW3: Sort the elements of a list.
I wasn't able to solve this problem. I had to look it up on this web site. Once I found the solution online, I tried to copy it over in my own terms such that I might be better able to understand it.
 % Sort the elemnts of a list.

 %  %
 %  %
 % NOTE: I was NOT able to figure this homework problem out. I
 % copied the solution from here:
 % http://en.literateprograms.org/Merge_sort_%28Prolog%29
 % ... and rewrote it in an effort to learn it more about the
 % Prolog language... which I am finding frustrating.
 %  %
 %  %


 % Create the base rules for sorting simple lists.
 mergeSort( [], [] ).
 mergeSort( [A], [A] ).

 % Now, define a rule that sorts a nontrivial list.
 mergeSort( List, SortedList ) :

 % First, get the first two items of the list. The only
 % purpose of this subgoal is to FAIL if the list is less
 % that two items long (at which case, it can be handled
 % by our base knowledge).
 (List = [FirstItem, SecondItem  Tail]),

 % Split the list into two lists.
 listDivide( List, ListOne, ListTwo ),

 % Sort the two sublists.
 mergeSort( ListOne, SortedListOne ),
 mergeSort( ListTwo, SortedListTwo ),

 % Now that we have two sorted sublists, merge them back
 % into one, completely sorted list.
 listMerge( SortedListOne, SortedListTwo, SortedList )
 .


 %  %
 %  %

 % Create our base cases for dividing a list.
 listDivide( [], [], [] ).
 listDivide( [A], [A], [] ).

 % Now, we have to deal with dividing a list that is longer than
 % one list item.
 listDivide( List, FirstHalf, LastHalf ) :

 % Pop off the first and second list items from the incoming
 % list that we need to split.
 (List = [FirstItem, SecondItem  Rest]),

 % Define the first and last halfs of the list as the first
 % and last items, respectively, plus, the rest of the list
 % divided and appended.
 (FirstHalf = [FirstItem  FirstRest]),
 (LastHalf = [SecondItem  SecondRest]),

 % Divide the rest of the list to find the Tail portions of
 % the current split lists.
 listDivide( Rest, FirstRest, SecondRest )
 .


 %  %
 %  %


 % Build the base cases for list merging. This will be when we are
 % merging any list with an empty list.
 listMerge( A, [], A ).
 listMerge( [], A, A ).

 % Now, define the rules for merging two lists that are not empty.
 listMerge( ListOne, ListTwo, MergedList ) :

 % Split the two lists.
 (ListOne = [HeadOne  TailOne]),
 (ListTwo = [HeadTwo  TailTwo]),

 % Prove that the first head is smaller or equal.
 (HeadOne =< HeadTwo),

 % Define the merged list as starting with head one  the
 % smaller of the two heads.
 (MergedList = [HeadOne  MergedTail]),

 % Deifne the merged tail as first tail and the second list.
 listMerge( TailOne, ListTwo, MergedTail )
 .

 % Now, define the rules for merging two lists that are not empty.
 listMerge( ListOne, ListTwo, MergedList ) :

 % Split the two lists.
 (ListOne = [HeadOne  TailOne]),
 (ListTwo = [HeadTwo  TailTwo]),

 % Prove that the first head is greater.
 (HeadOne > HeadTwo),

 % Define the merged list as starting with head two  the
 % smaller of the two heads.
 (MergedList = [HeadTwo  MergedTail]),

 % Deifne the merged tail as second tail and the first list.
 listMerge( ListOne, TailTwo, MergedTail )
 .
Often times, when dealing with lists in Prolog, people often like to split the list as part of the rule "signature." That is, they use list decomposition on both sides of the unification. Coming from a more traditional programming background, however, I find this approach to be less than readable. That's why my solution always uses "whole" variables in the rule signature and then splits list parameters as subgoals of the given rule. I am sure that Prolog experts would find this horrific and a misunderstanding of logical unification.
Essentially, I am trying to shoehorn the look and feel of "methods" into the structure of "rules." I know that these rules aren't methods; but, thinking about them in that way does help me wrap my head around them a bit more. I am sure that this is not good longterm strategy for understanding. But, to be honest, I'm just trying to get through this chapter without freaking out.
Anyway, with this code loaded, I was able to sort lists:
 ? mergeSort( [], List ).
List = [] ? a
no
 ? mergeSort( [5,1,10], List ).
List = [1,5,10] ? a
no
 ? mergeSort( [25,3,5,1,10,12], List ).
List = [1,3,5,10,12,25] ? a
no
After day 2, Prolog makes a bit more sense... but not much. To be honest, this is not a language that I am enjoying to any great degree. While it does seems to make certain tasks easier, it also seems to make many tasks much more complex.
Reader Comments
Many of your solutions differed a lot from mine. You used a lot less parameters in your recursive solutions and it was a good learning experience to see solutions without accumulators. I'm not sure if that is good or bad because it's hard to get a sense of the complexity. All my intuitions about running times developed using imperative languages doesn't seem to transfer over. Many of the tutorials I looked at online were fond of accumulator passing style so once I saw the pattern I started to use it in all my predicates. It might be useful as you start day three so here is the example I found online of how to reverse lists using accumulators
Basically Z collects or accumulates the elements of the first list in reverse order and when there are no more elements left we unify Z with W. Initially both [XY] and Z will be bound to lists and this was tripping me up because I was wondering what the hell W is for but it needs to be there because we are not modifying Z but instead creating a new logical variable on each recursive call and we thread a free variable around because we are going to unify it in the end with the actual reversed list. I kept forgetting that everything is single assignment.
Looking forward to seeing your work from day 3. I haven't yet started. I'm a little scared seeing as how pretty much every day of prolog has been kicking my ass.
@David,
I'm a little scared seeing as how pretty much every day of prolog has been kicking my ass.
... for real! Ha ha. Every time I finish (or attempt to finish) the homework, I leave in a really bad mood. My mind is not bending to fit into the Prolog model.
When I was looking up stuff for the homework, I did come across the accumulator approach. I sort of got it. I think part of what confuses me so much is that all the rules split the lists as part of the rule "signature." My brain simply can't handle that; I can read AND parse at the same time. That's why I always use "whole variables" on one side and then break them down on the other side. Like I said, it helps me to think of them as "functions" rather than rules.
I'll see if I can get to Day 3 tonight. It's so demoralizing :D
I know nothing about Prolog, and I'm posting this immediately after looking at your "reverse list" problem. I think I understand why you're getting three "reverse lists" when you have three starting conditions; the three are redundant. Given that the reverse of [] is [], Prolog can use your rule to predict already that the reverse of [A] is [A]. Like this. "Take [A]. Split into the first element, and the rest. The first element is A, and the rest is, hmm, there is no rest. In other words, the rest is the empty set: []. So let's reverse [], that's [], and append A, that's [A]." The same thing goes for [A,B], of course. When you put in your three rules for reversal, I'd guess that Prolog simply went through each rule to predict what the reverse of [1,2,3,4,5] would be. Since the rules were redundant, all its predictions came out the same.
Does that make sense?
@Matt,
You're probably right. I obviously can't speak to the technical reasons behind what's going on. But, to be honest, I was somewhat surprised that it didn't go "unique" on the results. I just assumed that that kind of logic would be built into a system that revolved around queries.
@Ben,
Look at all your talk about rules at the beginning of your post. I see Prolog being less "revolving around queries" as "revolving around rule systems." You give it a number of rules ("here's how you reverse X, here's how you reverse Y, etc."), and a question about the rules apply to a particular item ("according to these rules, how do you reverse [1,2,3,4,5]?") and it works out all the possible rules that might apply, and all the ways in which they might apply.
So in this case, you give it three rules to start with: reverse[] = [], reverse[A] = [A], and reverse[A,B] = [B,A]. Then you give it the general rule: "to reverse a list, reverse the tail and append the head." So it says, "OK, let's see what answer the general rule and 'reverse[] = []' gives. Now, the general rule and 'reverse[A] = [A]'. Now, the general rule and 'reverse[A,B] = [B,A]'." Since you have three starting rules, it *wants* to show you three possible results. Who knows, maybe you want to see whether any of these cases are redundant. (Mathematicians are all about trying to reduce to previously solved problems; they wouldn't want to know just the answer, but also which rules produced the same answer so that they could eliminate them.)
@Matt,
That sounds good; but again, I can't actually talk to the technical aspect of why things are working. All I can say is that if I wanted to know when a movie was playing at a theater, I wouldn't care if it was playing on two screen simultaneously at 2:00PM. I'd only want one result back. But, perhaps that's building too much "intent" into the answer.
Hi, I tried to program quicksort. It's not really that hard since quicksort can be defined purely recursive. Here is my code.
I finally got around to doing my Prolog homework so I thought I'd post my sorting answer since it's very different to both yours and Noud's:
ordered([X],[X]).
ordered(L,[PS]) : smallest(L,P), append(A,[PB],L), append(A,B,Q), ordered(Q,S).
This makes Prolog do most of the work. smallest pulls out the smallest value, P, of L and the ordered version of L has P at the beginning. P is some member of L such that it has a list A before it and a list B after it. The rest of the ordered version of L, S, is some ordered list made up of A and B appended.
Hmm, maybe this is a form of QuickSort since it pulls a pivot element out and partitions the list around that (so perhaps it's not so different to Noud's solution after all?).
@Noud,
Looks very nice. I think my biggest problem is that I never really managed a good grasp on sorting algorithms in general (NOTE: I got a 49/100M on my first Algorithms test in college ... yikes. Hey, at least I ended with a B+).
@Sean,
*Even* when you tell me what it does, Prolog still makes my head hurt :) That's definitely the most concise example that I've seen so far, both here and in Google.
I'm looking to get started with Clojure tonight; the holidays said me back a lot. I hope I haven't lost my Functional understanding.
@Ben,
I'll be very interested to see how you get on with Clojure! The lack of assignment / side effects means you have to use a functional approach to solutions, which is a great mental exercise.
I'm really enjoying Clojure and I expect to have some Clojure code in production some time soon as part of the back end of a CFML app, along with the Scala daemons that we already have in production.