The SVN Model of Scientific Writing

It has been far too long since I actually coded something, and I miss it.  I actually enjoy creating software.  It’s part puzzle-solving, part artistic expression.  I have, however, been advancing my Ph.D. research.  So, in an attempt to justify my lack of software, I thought of a way to think of my research in terms of software development.

It all comes down to a simple epiphany:  advancing research is like software development.  Instead of committing code, you write a research paper.  And thus, the SVN Model of Scientific Writing was born (in my head).  For the uninformed, SVN (short for Apache Subversion) is a software version control system, a tool that facilitates the storage of software (in truth, any text file) for purposes of archival, retrieval, and collaborative development.  I admit that, without understanding SVN, this post becomes tedious to go through, especially since I want to (generally) keep my posts brief.

The SVN Model of Scientific Writing is simple:  each of the SVN commands has an interpretation in research.  Like the real SVN, there is an appropriate use of the rSVN commands (commands for the research-SVN system).  The rSVN commands are syntactically equivalent to SVN commands, and we will discuss some of the most basic commands in the sections that follow.

rsvn checkout

This command pulls a research trunk relevant to your work into your local working copy.  Usually, this means finding a foundational paper or series of papers relevant to your field.  If you are into AI (as IA ((get it?)) ), you are in luck because there exists a website that contains a Reading List for the Qualifying Examination in Artificial Intelligence at Stanford (it’s from ’96, but the list is still very useful).

You should only need to do this once.  But research is full of fun moments like existential life crises, and discovering really cool applications completely outside of your field, so you may need to execute rsvn checkout several times.

Also, you may need to do this both for the general field and for your depth area.

rsvn update

When you are done checking out the foundational paper / series of papers, you will probably execute rsvn update, in an attempt to find work which has progressed from that starting point.  This is where the wonderful tool Google Scholar comes in.  Type in the foundational paper’s name in Google Scholar, and click on the “Cited by…” link to begin your rsvn update.

rsvn add

When you have progressed in your research, you execute this command by submitting a paper for publication.  You normally won’t add incomplete work to svn, and the same goes for rsvn.  Execute this command when you have completed a measurable quantum of research work.

rsvn commit

As in most development environments, to earn the right to commit the file, you must demonstrate that the work is as bug-free as possible and that it accomplishes a specific measurable task.  This is where peer review comes in.  If the peer-review process comes back with a favorable result, then you will execute rsvn commit by submitting the camera-ready copy of your paper.

 

Those are some of the basic commands you will execute.  The number of svn commands is (currently) much greater than the number of rsvn commands.  I am, however, interested in seeing more mappings for some of the other svn commands, and I may come back to revise this list as those mappings become clear.

Filed under Philosophy of Science

A Map of the Music Spectrum

About one year ago, I interned at Apple.  This post isn’t about my internship experience, although that might make an interesting post.  It’s about one simple interaction I had with a (then) co-worker.

I was in the ever-present awkward phase of the internship where you are getting to know your co-workers.  You know, the phase where you try to act as normal as possible so as to not create awkward situations that would lead you to be ostracized for the subsequent 3 months of the internship.

In one of my exchanges, a co-worker asked me:  “What kind of music do you like?”  Without thinking, I blurted “I’m pretty versatile.  From classical rock to electronica, I can listen to pretty much anything.”

…wait, what?

I must have repeated that phrase in my head at least 4 times after I said it.  In my head, I somehow created a spectrum that started at classical rock and ended at electronica.  The phrase I uttered is content-less, since it really depends on the person interpreting it for it to make any sense.  For example, what exactly lies at the half-way point of this spectrum?  I bet that you’ll get a different answer depending on who you ask.

However, as silly as it sounded back then, I can’t help but think there must be something that links different genres of music together.  Alas, I am not a music theorist, so my conjectures are quite limited.  It is fun to think of, though.

Filed under Philosophy of Art

The Cognitive Modules of Video Game Agents

(This is a cross-post from my entry in the Liquid Narrative blog at NC State University)

I am currently enrolled in the graduate-level class “Introduction to Cognitive Science” (by the way, it’s incredibly interesting).  We recently came across a particular hypothesis that I feel could be applicable to the design and implementation of agents:  the mind modularity hypothesis.  My conjecture is that a modular approach to game agents will make their development easier and will allow for flexible character behaviors.  I will outline my reasons for this in the subsequent discussion.

The mind modularity hypothesis was first espoused by Jerry Fodor in his seminal book, The Modularity of Mind.  In this book, Fodor argues that the mind is made up of semi-autonomous cognitive modules.  In fairness, Fodor wasn’t the first to come up with this idea.  The idea can be traced back to Franz Joseph Gall, a phrenologist known for trying to pin specific mental functions down to particular regions of the brain.  While you might not immediately remember Gall’s work, you might recall a picture of what he was advocating for:

Figure: a definition of phrenology with chart, from Webster's Academic Dictionary, circa 1895 (image courtesy of Wikipedia).

While Fodor did not agree with Gall’s view that cognitive faculties could be read off the shape of the skull, he did agree with Gall’s characterization of cognitive modules as independent information processing mechanisms.  Fodor also considered these modules as informationally encapsulated, that is, restricted to certain types of information provided by outside stimuli.  Fodor discussed certain candidate mechanisms in human cognition that could be partitioned into specific modules (this list is not exhaustive):

  1. Color perception
  2. Shape analysis
  3. Grammatical analysis of perceived utterances
  4. Face recognition

Fodor thought that these modules could be hierarchically organized.  For example, the information processed by the color perception module and the shape analysis module could serve as input for the face recognition module.  Fodor also thought that there must exist a central control unit, which coordinated the information processed by all the modules harmoniously.

My main point is this:  creating a set of artificial intelligence modules for video game agents could make their development easier and could allow for flexible character behaviors at game run-time.  Cool idea, huh?

Well, someone already thought of it.  Stavros Vassos’ approach is through the field of cognitive robotics, in which robots are programmed with similar reasoning modules that allow for higher-level reasoning.  Vassos gives a list of reasons why this is a good idea, and proposes two cognitive modules:

  1. A path-finding module (good ol’ A* search)
    Agents in a video game typically need to move, so A* is the weapon of choice.  Arguably the bread-and-butter of the games A.I. world, A* search enjoys widespread use.
  2. A temporal projection module
    This module would “keep track of the current state of the world and predict how it changes when actions take place”.  Vassos comments that the only well-known reasoning module ever used in a commercial game is a simplified version of a STRIPS planner used in the game F.E.A.R. (2005).
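
To make the path-finding module a bit more concrete, here is a minimal sketch of A* in Python on a small grid.  Everything specific in it is my own assumption for illustration and not something from Vassos’ talk:  the grid-of-strings map, the `a_star` function name, the 4-connected unit-cost moves, and the Manhattan-distance heuristic.

```python
import heapq


def a_star(grid, start, goal):
    """Return a shortest path from start to goal as a list of (row, col), or None.

    grid is a list of equal-length strings; '#' marks a wall.
    Moves are 4-connected with unit cost (an assumption of this sketch).
    """
    def heuristic(cell):
        # Manhattan distance never overestimates on a unit-cost 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    frontier = [(heuristic(start), 0, start)]  # entries are (g + h, g, cell)
    came_from = {start: None}
    best_cost = {start: 0}

    while frontier:
        _, cost, current = heapq.heappop(frontier)
        if current == goal:
            # Walk the parent links back to the start, then reverse.
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        if cost > best_cost[current]:
            continue  # stale queue entry; a cheaper route was already found
        r, c = current
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    came_from[(nr, nc)] = current
                    heapq.heappush(
                        frontier,
                        (new_cost + heuristic((nr, nc)), new_cost, (nr, nc)),
                    )
    return None  # the goal cannot be reached


# A made-up map: the agent must walk around the wall in the middle column.
level = [
    ".#.",
    ".#.",
    "...",
]
route = a_star(level, (0, 0), (0, 2))
```

A real game would likely search a navigation mesh or waypoint graph rather than a character grid, but the structure of the search stays the same.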

At the end of his talk, Vassos points out that A.I. researchers and developers would find feedback from the video games industry useful for developing such modules.  Since I haven’t found anything beyond this link (admittedly, my search was not very deep), I propose two more modules that I think would be useful:

  1. Ranked selection module
    This module is inspired by a blog post on #AltDevBlogADay, written by Luke Hutchinson, who was an engine programmer at the (now defunct) Team Bondi.  Hutchinson summarizes a situation which is a recurring problem in games:  “Given n items, find the ‘best’ k of them”.  Let’s say you are playing your favorite RTS game.  The enemy needs to prioritize which of your units to attack.  Given your n units, it must select k units to directly target.  I’m sure you can think of additional situations where this is useful.  Hutchinson has graciously provided pseudocode for an algorithm that performs precisely this type of selection.
  2. Computer error module
    This second module is inspired by Lars Lidén’s paper:  “Artificial Stupidity:  The Art of Intentional Mistakes”.  This module would be responsible for inserting errors into the calculations of the agent so as to make the agent think “humanly”.  The frequency of the introduced errors will probably vary inversely with the level of intelligence that the game developer wishes to gift the A.I. agents.
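
As a rough illustration of the ranked selection problem (“given n items, find the best k”), here is a small Python sketch.  To be clear, this is not Hutchinson’s pseudocode; it simply leans on the standard library’s `heapq.nlargest`.  The unit data and the `threat_priority` scoring rule are made up for the example.

```python
import heapq


def best_k(items, k, score):
    """Return the k highest-scoring items, best first."""
    return heapq.nlargest(k, items, key=score)


# Hypothetical player units in an RTS game.
units = [
    {"name": "tank", "health": 90, "distance": 12},
    {"name": "scout", "health": 20, "distance": 5},
    {"name": "medic", "health": 35, "distance": 8},
    {"name": "sniper", "health": 40, "distance": 30},
]


def threat_priority(unit):
    # Assumed scoring rule: weak, nearby units make the most attractive targets.
    return -(unit["health"] + unit["distance"])


targets = best_k(units, 2, threat_priority)
target_names = [unit["name"] for unit in targets]
```

With this scoring rule the enemy goes after the scout first and the medic second.  Swapping in a different `score` function changes the agent’s priorities without touching the selection code.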

I’m sure that there are many more modules that could potentially make this list and maybe even modules that don’t belong, so I will probably come back (in spirit) to the list to add and remove modules accordingly.  With all these modules available for use, a game developer could mix-and-match to create a diverse populace for the game world, with distinct cognitive faculties.  Moreover, it would (theoretically) be a matter of generalizing each module, such that it is potentially available to all agents in the game, and then limiting access to the cognitive modules to your heart’s content.
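
Here is one way the mix-and-match idea might look in Python.  It is only a sketch under my own assumptions:  the `process` method that every module shares, the class names, and the choice to model a “mistake” as shuffling an already-ranked list are all hypothetical, and the `Agent` class is a very simple stand-in for Fodor’s central control unit.

```python
import random


class RankedSelection:
    """Keep the k best options according to a scoring function."""

    def __init__(self, k, score):
        self.k = k
        self.score = score

    def process(self, options):
        return sorted(options, key=self.score, reverse=True)[: self.k]


class IntentionalError:
    """With probability error_rate, scramble the options to simulate a mistake."""

    def __init__(self, error_rate, rng=None):
        self.error_rate = error_rate
        self.rng = rng if rng is not None else random.Random()

    def process(self, options):
        options = list(options)
        if self.rng.random() < self.error_rate:
            self.rng.shuffle(options)
        return options


class Agent:
    """A central controller that pipes information through its modules in order."""

    def __init__(self, name, modules):
        self.name = name
        self.modules = modules

    def decide(self, options):
        for module in self.modules:
            options = module.process(options)
        return options


def identity(value):
    return value


# Two agents built from the same parts, with different cognitive faculties.
veteran = Agent("veteran", [RankedSelection(2, identity)])
rookie = Agent("rookie", [RankedSelection(3, identity), IntentionalError(0.5)])

veteran_choice = veteran.decide([3, 1, 4, 1, 5])
```

The veteran always ends up with the two best options, while the rookie ranks three options and then, about half the time, scrambles their order.  Limiting an agent’s access to a module is just a matter of leaving it out of the list.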

Filed under Philosophy of Cognitive Science, Philosophy of Computer Science

The problem with being precise

I love science.  Science demands precision.  I should love precision.

I don’t like being precise; at least, not around people I’m not familiar with.  I feel I come across as a jerk.  At least, that’s my theory of mind of the person I am talking to when I am precise.  I always envision the opposite party in conversation judging me how I used to judge people when they were being precise.  I have only recently begun to enjoy being absolutely precise in my wording and speaking.

Why?  Because people immediately associate precision with pedantry.  And pedantry is bad; at least, people don’t seem to appreciate a pedant.  In fact, the dictionary on my laptop defines “pedant” as:

a person who is excessively concerned with minor details and rules or with displaying academic learning.

I don’t think that’s fair.  If you are trying to present an artefact of science, precision is necessary.  You can’t just say “people seemed to react differently in these two experimental groups” – there must be some way of demonstrating beyond reasonable doubt that a phenomenon was observed as evidenced by a particular data set.  Being chastised for being precise is probably a remnant of the time when jocks ruled the Earth and being nerdy was still uncool.  To be precise is to desire understanding, to be elegant in thought and to be clear in conviction.

Next time you think that someone is pedantic, think instead that the same person is just being precise.  You might see them differently.

Filed under Philosophy of Science

An artist’s Hello World

I was recently discussing art styles with my wife.  We were heading home from the coffee shop that has become my new haunt when she started explaining what she does when she wants to experiment with a new style.  “I draw my face”, she said.  Curious as to why, I looked at her, quizzically.

“Well,” she continued, “just think.  What other figure have you known for your entire life?  I mean, you’ve seen yourself many times throughout your life.  You know your curves, your smile, your defects.  What else could you master so well?”

The simplicity of the argument and its elegance made me think.  “It’s an artist’s Hello World,” I said.  She smiled, because she knew what I was talking about.  After more than 4 years of hearing me think about programming out loud, she was programming-savvy enough to know that all programmers strive to complete a “Hello, World!” exercise prior to venturing forth in a new programming language.

I wonder if artists universally do that – encode themselves (or some other object) every time they try out a new medium.  It’s kinda neat to think that there is a parallel between art and programming.

Filed under Philosophy of Art

Reply: “The Ph.D. problem: are we giving out too many degrees?” by Kate Shaw, ArsTechnica

Before I start, my bias is that I’m a Ph.D. student, so take what I say from that perspective. A colleague of mine recently brought to my attention the article of the above name.

The article can be found here.

Stay thirsty, my friends.

The world's most interesting man thinks you should get a Ph.D.

My opinion is echoed in the article; too often are Ph.D. students divorced from real world applications. The dream of “tenure” is slowly fading away because there is a finite number of tenured (professor) positions, and those are mostly already taken by professors with established research tracks. Ph.D. students should have training in real-world applications, so that they can pursue jobs/work outside of traditional academia.

However, as we go into the future, I think we need more Ph.D.’s, not fewer. Humanity faces (on a daily basis) increasingly complex problems. The true value of a Ph.D. (as opposed to standard undergraduate education) is the ability to do research on topics that are ill-defined/possibly non-existent and produce scientific work of value. If more people are capable of coming up with solutions to problems we’ve never faced before, I’d say humanity stands a chance to face issues such as global warming, finite energy resources, overpopulation, and space exploration. I’m not saying that undergraduates out of top-tier institutions aren’t capable of doing that once they’re out the door…but on average, I’d be willing to take a chance in saying that they need a little more education/scientific training (which could go a long way).

There is a long-standing fallacy of innovation (often applied as a “social concern”) which sounds like: “if we create this new technology, we’ll put people who do this work manually out of business”. Like all fallacies, this is just not true; it assumes a finite amount of work in the world. I think something similar applies for Ph.D.’s: while there is a finite number of tenured Ph.D. positions, there is not a finite amount of Ph.D.-level work in the world.

Re-interpreting the most interesting man in the world:
“Stay thirsty (for knowledge) my friends.”

Filed under Philosophy of Education

A Guide on How To Evaluate Papers in Computer Science

[TL;DR  Try answering all questions below.]

When I started graduate school, I found out the hard way that “we’re not in Kansas anymore”.  Many professors told me it was all about the research, and not at all about grades.  Grudgingly, I stopped obsessing over assignments, and started obsessing over research papers.  However, the transition wasn’t easy.  Learning how to evaluate research papers critically is a continuous process, and I tried, to no avail, to find a starting point online on how to evaluate a research paper in Computer Science.  Each scientist must find their own “groove” or “style” when evaluating another scientist’s work.

I’ve been fortunate enough to be surrounded by really bright colleagues who have shaped my thought process, as well as really bright mentors who have taught me how to think critically.  This blog post unites their collective wisdom into one guide: A Guide on How To Evaluate Papers in Computer Science.

I warn you that this is not a be-all, catch-all, one-stop solution on evaluating papers.  I am sure there are millions of ways to evaluate a paper.  What I hope to provide with this guide is essentially a starting point for your “thinking process” (which I could not find elsewhere).

How I suggest you use this guide is essentially by filling in the blanks the questions pose after reading the paper.  If you decide to fill it in as you go, go back to the questions when you’re done and make sure your answers still hold; you must verify whether you captured the essence of the paper.

1. What are the main points?

Every self-respecting evaluation of a paper will have at least a summary of the work that was presented.  The key is to not cite the paper at all in your description; paraphrase the ideas and think of how you would explain the idea to someone in Computer Science outside your area of expertise.

2. What did I not understand?

This is usually tough for scientists because it explicitly indicates a lack of understanding.  Don’t worry.  It’s not that you aren’t smart, rather that currently you are not sure of what the article is stating.  The key to this question is not saying “I don’t get X”, but rather, “I don’t get X, but I think it means Y.”

3. If I were to build a system out of this, how would I do it?

The beauty of computer science is that even if the content of the paper is highly theoretical, at some point the technical knowledge can be made concrete.  Ask yourself, “How can we do that?”  Often you’ll find implementation flaws that indicate theoretical flaws…this feeds itself because you can then identify ways to improve the theory.

** If the paper itself discusses an implementation, question 5 will be particularly relevant.

4. Does this paper solve the field?

The answer to this question should be no.  The purpose of this highly rhetorical question is to inject (what I think is) healthy satire into your analysis.  Unless the paper proves that P = NP, it hasn’t solved the field.  This leads us to the next sub-question.

  • If not, what did they do wrong?

You might be thinking “oh well, this is just trying to break this paper”.  And that is exactly what you are doing.  Science is not art.  It is subject to scrutiny, review and analysis of hypothesis, methodology and results (more on this later).

Find something that they did wrong, or did not do at all.  I’m sure there is something.

  • If so, what did they (the authors) consider that no one else in the universe did?

Once again, this is a very rhetorical question.  If you’re answering this question, it is highly likely that you’re not being critical enough.  Go back and review more!

5. Is there any other way to do what the authors did?

This is very relevant if the authors are describing a system in their paper.  If not, the question should be focused towards other methodologies to evaluate the same hypothesis.  The previous statement is actually a perfect set-up for my next set of questions.

6. Does the experimental claim have validity?

Recall the field you (presumably) are studying.  Computer Science.  Computer Science.  Computer Science.

Science.

This is a scientific field and we’re still at the mercy of the scientific process.  The paper should try to assess some scientific claim.  If you’re rusty on your scientific terminology, recall that this claim is the research hypothesis.

Go back to your “main points” question (first one).  If you did not write the hypothesis of the article, it’s a bad sign for the paper. If you can guess the hypothesis, but can’t answer any of the following questions because you can’t find the answers, it’s another bad sign.

Caveat:
There are many papers in Computer Science that don’t evaluate any claim.  Survey or position papers are an example.  However, the good papers that don’t evaluate anything usually explicitly state that or something similar.  What you have to be careful for are unwarranted or unsupported claims.

As a general heuristic, you should ask yourself: “Is this paper worth anything?”

  • Measurement validity: “Can we trust the process?”

This basically asks whether the process makes sense.  What you’re looking for is whether or not the phenomenon’s operationalization makes sense.  In other words, does the phenomenon they study map to what they measure?

  • Internal validity: “Can we trust causal assertions?”

This question asks whether or not we can trust that the phenomenon we are measuring is a direct consequence of an external variable.  Recall that the hypothesis revolves around observing one effect as a direct consequence of another, and depending on how well the authors have crafted their experiment, your trust in their causal assertions should vary.  Internal validity can be broken up into the next three points.

  • Is there temporal ordering? (“Does x come before y?”)

This tries to establish the dependent variable in terms of the independent variable.  Does the author’s claim (related to the “y”) come as a result of their perturbation (related to “x”) ?

  • Is there association? (“Do x and y move in a pattern?”)

If the authors’ introduced change (“x”) really does alter the phenomenon they are making claims about (“y”), then you should be able to predict the next “y” for a change in “x”.  If they move erratically, that’s a red flag.
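
One quick (and admittedly crude) way to check for association when the paper reports its raw numbers is to compute Pearson’s correlation coefficient yourself.  The sketch below is only an illustration:  the `pearson_r` helper and the data points are made up, and Pearson’s r only picks up linear patterns, so a value near zero does not rule out some other kind of relationship.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return covariance / math.sqrt(spread_x * spread_y)


# Made-up data: y rises in lockstep with x, then falls in lockstep with x.
x_values = [1, 2, 3, 4]
moving_together = pearson_r(x_values, [2, 4, 6, 8])
moving_opposite = pearson_r(x_values, [8, 6, 4, 2])
```

Values close to 1 or -1 mean x and y move in a tight linear pattern; values close to 0 are the “erratic” case that should make you suspicious.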

  • Can we rule out any and all rival explanations?

As one of my mentors once explained to me, the answer to this question is “proof by lack of imagination”.  It is impossible to list all the ways the “y” could be changed due to changes in factors x1, x2, x3, x4, …, xn.

That being said, that is no excuse for the authors not “doing their due diligence” and thinking really hard to eliminate/consider all possible factors.  If anything exists that was not considered, its impact should be minimal or should affect the entire experiment equally (a bias, not an error).  This sort of implies you can “factor it out” when analyzing.

  • External validity: “Can we generalize the findings?”

This depends directly on how they obtained their sample for observing the “y”.  Look at the sample size used (a common weakness of many papers), the way participants were recruited, and the statistical tests used.  If these three things check out, it is possible that it’s all good.

Phew!  That was a brain-dump.  Having exposed my method, I will close with this:  The scientific process is not perfect.  It is really easy to think that after such rigorous in-depth analysis, if an article meets the required scientific backup, it is fact.  If only it were so easy.  The scientific process is a man-made construct that tries to impose order and logic in the chaos that is the universe.  Let’s just say it might miss a few things.

Playing both sides of the devil’s advocate role, even though it is not perfect, it’s the best process we have.  It is our role as scientists to critically evaluate the work of others and not take anything for granted.  Doing so is the only way to find the paths that lead humanity to bigger and better places!

Filed under Guide, Philosophy of Computer Science