Biozenformatics: April 2015

Intro

Humans have been inventing better, faster, and more accurate ways of doing computation for literally thousands of years. Examples of this idea include Archimedes' Antikythera Mechanism, considered the first analog computer (Wallis, 2008) from about 100 B.C. to Charles Babbage's Difference Engine in 1822 ("Difference engine - Wikipedia, the free encyclopedia," 2015), to the "First Computer Program" Lady Ada Lovelace devised to run on the Difference Engine ("Ada Lovelace - Wikipedia, the free encyclopedia," 2015), and Hollerith's punched cards for the 1890 census ("Punched card - Wikipedia, the free encyclopedia," 2015).

Among the prestigious company mentioned above, a name that cannot go unmentioned is Alan Turing. Recently, Dr. Turing has received a royal pardon and other much-deserved publicity and he is well known in Computer Science circles for invention of his "Turing Test" for Artificial Intelligence. While Turing is widely recognized for his contributions to cryptography in World War II, what is sometimes overlooked is his accidental contribution to general computer science and the impact it made on the war effort at the time.

From Theoretical Mathematics to Computers

Originally, Dr. Turing was writing a paper on mathematics called "On Computable Numbers, with an Application to the Entscheidungsproblem" (Turing, 1937, p. 230). In summary, Dr. Turing was attempting to answer the question, does there exist an algorithm that can prove every mathematical statement, no matter what it is, true or false? This, then, is the essence of the Entscheidungsproblem, or Decision Problem (Weisstein, n.d.).

To help answer this question, Dr. Turing described a computing machine capable of reading and writing "symbols" to a paper tape. This machine could be programmed with a set of internal rules before starting to read the tape and the combination of the previously read symbol and the configuration would constitute a new configuration - essentially equivalent to the machine state.

Eventually, this like of thought lead Dr. Turing to ask if the computer is capable of all possible computations, then is it possible for one computer to read another computer's code and determine if it will halt or go on forever. Turing was able to prove that it was not possible to create a program to read another program and determine if it would halt and then used this fact to prove the original question - that there is no way to always say if a mathematical statement is provable (Campbell, n.d.).

From Paper to Big Iron

Perhaps more important than the proof in Dr. Turing's paper was the almost off-handed mention of storing "symbols" on paper tape which can be read by the computer and will affect its state. It doesn't take much imagination to realize that this is the conceptual model for a stored-program computer.

While it is probable that the stored-program computer would have come about as a natural evolution of humanity's quest for mechanical calculating machines however we will never know. Shortly after Turing's paper was published the world was plunged into a war that found several countries clamoring for better and faster calculations.

Mathematician John Von Neumann is generally credited with invention of an early computer architecture that, among other things used the stored-program concept ("Von Neumann Architecture") for his work on the EDVAC computer and the associated paper published in 1945 "First Draft of a Report on the EDVAC." Von Neumann was working on The Manhattan Project at the Los Alamos National Laboratory - a computationally bound project. It is unknown whether Von Neumann was aware of Turing's paper but it is worth noting that Von Neumann was at Cambridge in 1935 and Princeton from 1936-1937 at the same time as Turing and the two were, in fact, acquainted ("Von Neumann architecture - Wikipedia, the free encyclopedia," 2015).

Prior to the EDVAC, the ENIAC computer was the state of the art in computation however it had a few drawbacks. Namely, it had no stored-program memory. Programming was quite literally done by the error-prone process of rewiring the circuitry and could take three weeks to program ("Von Neumann architecture - Wikipedia, the free encyclopedia," 2015).

At about the same time, Turing was working for England's Bletchley Park, site of the Government Code & Cipher School. He created the Bombe electromechanical machine that helped find German Enigma machine settings but was not a calculation device. It was more of a brute-force decryption device, having several Enigma machines wired together ("Bombe - Wikipedia, the free encyclopedia," 2015).

More importantly to the topic at hand, Turing also contributed to the Colossus project - a top-secret stored-program computer used by Bletchley Park for decryption of the Lorenz cipher (Copeland, 2006). Although the Colossus was designed by an engineer named Tommy Flowers with Turing's work on probability being incorporated in the design. Ironically, due to this work, Turing knew that the stored-program concept was feasible before the Von Neumann paper but was not allowed to reveal this fact due to the secretive nature of the Colossus project.

References:

Ada Lovelace - Wikipedia, the free encyclopedia. (2015, April 22). Retrieved April 27, 2015, from http://en.wikipedia.org/wiki/Ada_Lovelace#First_computer_program

Bletchley Park - Wikipedia, the free encyclopedia. (2015, April 12). Retrieved April 27, 2015, from http://en.wikipedia.org/wiki/Bletchley_Park

Bombe - Wikipedia, the free encyclopedia. (2015, April 13). Retrieved April 27, 2015, from http://en.wikipedia.org/wiki/Bombe

Campbell, M. (n.d.). New Scientist TV: See how Turing accidentally invented the computer [Web log post]. Retrieved from http://www.newscientist.com/blogs/nstv/2012/06/how-turing-accidentally-invented-the-computer.html

Colossus computer - Wikipedia, the free encyclopedia. (2015, April 27). Retrieved April 27, 2015, from http://en.wikipedia.org/wiki/Colossus_computer

Copeland, B. J. (2006). Colossus: The secrets of Bletchley Park's codebreaking computers. Oxford: Oxford University Press.

Difference engine - Wikipedia, the free encyclopedia. (2015, February 15). Retrieved April 27, 2015, from http://en.wikipedia.org/wiki/Difference_engine

Punched card - Wikipedia, the free encyclopedia. (2015, April 9). Retrieved April 27, 2015, from http://en.wikipedia.org/wiki/Punched_card

Turing, A. M. (1937). On Computable Numbers, with an Application to the Entscheidungsproblem. Proceedings of The London Mathematical Society. doi:10.1112/plms/s2-42.1.230

Von Neumann architecture - Wikipedia, the free encyclopedia. (2015, April 12). Retrieved April 27, 2015, from http://en.wikipedia.org/wiki/Von_Neumann_architecture

Wallis, P. (2008, July 31). Ancient Greek computer from 100 B.C.: Archimedes strikes again? Retrieved from http://www.digitaljournal.com/article/258045

Weisstein, E. W. (n.d.). Decision Problem -- from Wolfram MathWorld. Retrieved from http://mathworld.wolfram.com/DecisionProblem.html

This quarter I have a class called "Futuring and Innovation" that encourages us to create and write in a blog and I realize now that it's been quite a while since I last updated this blog. My original intent of course was to write about topics that interested me as I made my way through this Doctoral degree but as seems to happen I got busy with classes and research and have not updated as often as I would have liked.

So, in this new spirit of blogging, I'm going to discuss a video that I found entitled “Big Data is Better Data” by Kenneth Cukeir that touches on one of the topics in my dissertation. Feel free to watch the video below.

Mr. Cukier is of course discussing the innovation of big data. The buzzword "big data" is now a little bit passé however for those been living under a rock for the last couple of years the quick summary of the topic: big data refers to our ability to process information in ways previously impossible. recent advances in storage allow us to collect and keep more data than ever before and some recent advancements in computer algorithms, Namely the MapReduce algorithm made famous by Google in 2008 (Dean & Ghemawat, 2008), give us the ability to process larger amounts of data in smaller amounts time. Another unique factor is that this new processing ability can't handle unstructured data, Such as pictures or videos or sound files, whereas in the past data processing required some form of structure such as you would find in a database.

Mr. Cukier eloquently discusses some of the pros and cons of big data analytics (meaning the ability to turn the data into some sort of useful knowledge). He makes a compelling case for the collection and usage of the data and in that same vein I had a few thoughts on the matter myself.

The Good: We have become used to big data being used in our everyday lives in fairly benign ways - Amazon and Neflix recommendation engines are just two examples we take for granted. However, some possibly more life changing uses than suggesting you watch “Orange is the Next Black” or recommending shoes to go with that skirt, big data analysis is being used by the medical community to help find cures for cancer and even tailor drugs and therapies to individual patients, research to develop drought-resistant strains of rice, and the public sector to evaluate air quality and where to allocate police resources.

The Bad: There is a now-legendary story of big data analysis attempting to produce influenza outbreaks based on social media interaction that went awry (just because people are talking about the flu doesn’t mean they have it). Shaw (2014) tells of a big data researcher attempting to use cell phone data in Africa to predict peoples’ movements who thought at first to have found a predictor for cholera outbreaks when what he really found was confirmation of local flooding. Stories such as this serve to remind us that correlation does not equal causation.

The Ugly: Perhaps the most famous recent example of the misuse of big data comes from stories of the NSA collecting phone records and other data for the purpose of spying on American residents. These types of stories help us to understand the fragility of privacy. When so much data from so many sources is available, easy is it to pierce the think veil of privacy?

References:

Dean, J., & Ghemawat, S. (2008). MapReduce : Simplified Data Processing on Large Clusters. Communications of the ACM, 51(1), 1–13. doi:10.1145/1327452.1327492

Shaw, J. (2014, March). Why Big Data is a Big Deal. Harvard Magazine, 116(4), 30-35. Retrieved from http://harvardmagazine.com/2014/03/why-big-data-is-a-big-deal

Biozenformatics

Monday, April 27, 2015

Alan Turing - Accidental Inventor of the Stored-Program Computer

Intro

From Theoretical Mathematics to Computers

From Paper to Big Iron

Monday, April 20, 2015

Back in the Swing of Things