### Monday, December 30, 2002

**Reflections on 2002**

Posted by Lance

We have seen many exciting theorems during the past year but once
again we have failed to prove that P≠NP or anything even close to
it. There is always next year.
The most promising sign of the past year is the increase in submissions
and attendance at conferences pretty much across the board in
theoretical computer science. The large
number of students studying theory makes up much of this increase. In
the late 1990's during the dot-com boom, very few students, in
particular Americans, went to graduate school in computer science. But
with the days of easy money over and the need for computer science
faculty still great, we have seen a large increase in the number of
students. These students have brought, and will continue to bring, new ideas
and directions to our field. Let us hope there are enough jobs for all
of them.

I also started this web log during this past year. Initially, I
started this blog just to try out a new technology but I have had a
blast writing these posts and sharing my knowledge and experiences. I
hope you have enjoyed reading them. I don't understand how I rank so
high on a Google search on
"web log". Perhaps because "weblog" is
supposed to be one word.

In remembrance: Edsger
Dijkstra and Steve
Seiden.

Have a good New Year's, everyone!

8:17 AM
#
0 comments

### Friday, December 27, 2002

**FOCS is going to Europe**

Posted by Lance

In 2004, for the first time, the FOCS conference will be
held outside the United States or Canada. The 45th FOCS conference will
be held in Italy. STOC held
its first conference outside of North America in 2001 in Greece.
First some background on these conferences quoted from the preface of
David Johnson's 1991 STOC/FOCS Bibliography.

*Since the 1960's,
two of the most important venues for presenting new results in
theoretical computer science have been the annual STOC and FOCS
conferences sponsored respectively by the Association for Computing Machinery and
the IEEE Computer
Society. "STOC" is an acronym standing for the ACM
Symposium on Theory of Computing and "FOCS" stands for what
is now called the IEEE Symposium on Foundations of Computer
Science. The STOC Conference is organized by ACM's Special Interest
Group on Automata and Computability Theory (SIGACT) and has been held every spring
since 1969. The FOCS conference is organized by what is now called the
IEEE Technical Committee on Mathematical Foundations of Computer
Science (TC-MFCS) and is
held in the fall. It began in 1960 as a "Symposium on Switching
Circuit Theory and Logic Design" (SCT&LD), changed its name
in 1966 to "Symposium on Switching and Automata Theory"
(SWAT), and assumed its current name in 1975.*

A few updates: SIGACT now stands for the Special Interest Group on
Algorithms and Computation Theory. The 33rd STOC in 2001, besides
being the first held outside the US or Canada, is also the first held
in the summer. The SWAT acronym has been appropriated by the
Scandinavian Workshop on
Algorithm Theory.

FOCS and STOC have changed in other ways. Until the 80's, they were
the only major outlet for conference papers in theoretical computer
science. Now there are several specialized conferences including, of
course, the IEEE Conference on Computational Complexity. Many researchers put
a greater emphasis on their specialized conference than STOC and
FOCS. This is somewhat true in complexity but much more so in, say, computational learning theory.

While most, but not all, of the best papers in theory still appear in
STOC and FOCS, these conferences no longer reflect the broad interests
of theoretical computer scientists. This was probably an inevitable outcome
of a field maturing and becoming more specialized.

10:07 AM
#
0 comments

### Thursday, December 26, 2002

**Find the Longest Path**

Posted by Lance

How about some music? Here is an oldie but goodie. Daniel Barrett wrote Find the Longest Path during a final exam at Johns Hopkins in 1988.
It is an algorithms song sung to the tune of Billy Joel's For the Longest Time. You can listen to the song (1.9 MB MP3)
or just read the lyrics.
8:12 AM
#
0 comments

### Monday, December 23, 2002

**A Note on Running Times**

Posted by Lance

Suppose you have two papers to review that give algorithmic
improvements for two different problems. Here n is the input size.
**Paper A**: Old running
time was 4^{n}. New running time is 2^{n}.

**Paper B**: Old running
time was n^{4}. New running time is n^{2}.

Which paper gives the better improvement? This is not such an easy
question to answer. Here are three ways to look at the question with
three different results.

**Analysis 1**: Consider the improvement as a function of the input
size: Old Running Time divided by New Running Time. Paper A is the
clear winner with an improvement of 2^{n} over the
n^{2} improvement of Paper B.

**Analysis 2**: Consider the improvement as a function of the old
running time. Here we have a tie; in both papers the new running time
is the square root of the old running time.

**Analysis 3**: Suppose we are interested in the largest problem we
can solve with current technology. Fix a time t and consider the
largest problem we can solve in time t. In Paper A we go from (log
t)/2 to log t, a factor of 2 improvement. Paper B does much better
going from t^{1/4} to t^{1/2}, a quadratic
improvement.
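The three analyses are easy to check numerically. Here is a small Python sketch (the variable names are mine, not from the post) using sample values n = 20 and t = 2^{40}:

```python
import math

n = 20          # sample input size
t = 2 ** 40     # sample time budget

# Analysis 1: improvement ratio old/new as a function of n.
ratio_A = 4**n // 2**n         # = 2^n
ratio_B = n**4 // n**2         # = n^2
assert ratio_A == 2**n and ratio_B == n**2

# Analysis 2: each new running time is the square root of the old one.
assert math.isqrt(4**n) == 2**n and math.isqrt(n**4) == n**2

# Analysis 3: largest n solvable within the time budget t.
largest_A_old = math.log2(t) / 2   # 4^n <= t  ->  n = (log t)/2
largest_A_new = math.log2(t)       # 2^n <= t  ->  n = log t
largest_B_old = t ** 0.25          # n^4 <= t  ->  n = t^(1/4)
largest_B_new = t ** 0.5           # n^2 <= t  ->  n = t^(1/2)
assert largest_A_new / largest_A_old == 2.0    # factor-2 gain for A
assert largest_B_new == largest_B_old ** 2     # quadratic gain for B
```

With t = 2^{40} the Analysis 3 numbers are exact: Paper A moves from n = 20 to n = 40, while Paper B moves from n = 1024 to n = 1048576.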

6:02 AM
#
0 comments

### Sunday, December 22, 2002

**Pictures of Manindra**

Posted by Lance

Here is a sketch of Manindra Agrawal that I scanned in from the New York Times magazine article I mentioned last week. For contrast I added a photograph of Manindra and his student co-authors Neeraj Kayal and Nitin Saxena (from right to left) from the IIT Kanpur Primes in P page.

A complexity theorist immortalized in art. Brings tears to my eyes.

6:02 AM
#
0 comments

### Friday, December 20, 2002

**Rules for a Complex Quantum World**

Posted by Lance

Michael Nielsen has a recent Scientific American
article giving a very readable view of
quantum information and computation. Nielsen is also co-author with Isaac Chuang
of Quantum Computation and Quantum Information, the best book I've seen on the topic.
9:48 PM
#
0 comments

### Thursday, December 19, 2002

**Foundations of Complexity **

Lesson 11: NP-Completeness

Posted by Lance

Previous Lesson |
Next Lesson
Informally, NP-complete sets are the hardest sets in NP. What does it mean
to be hardest? Here we use a polynomial-time version of reduction,
a concept we first saw in
Lesson 5.

Formally a language L is *NP-complete* if

- L is in NP, and
- For all A in NP, there is a polynomial-time computable function f
such that x is in A if and only if f(x) is in L.

We say that f *polynomial-time reduces* A to L.
The following theorem captures the intuition for NP-completeness.

**Theorem:** Let L be an NP-complete language. Then L is in P if
and only if P = NP.

**Proof:** If P = NP then since L is in NP then L is in P. Suppose
L is in P and A is in NP. Let f be the reduction from A to L. We can
then determine whether x is in A by testing whether f(x) is in L. ◊

In particular if any NP-complete set has an efficient algorithm then
all NP-complete sets have efficient algorithms.
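The mechanics of the proof can be sketched in code. The sets below are deliberately trivial parity languages, chosen only to show how a decider for L plus a reduction f yields a decider for A; neither set is actually NP-complete:

```python
def in_L(y):
    """Stand-in for a polynomial-time decider for L: even number of 1s."""
    return y.count('1') % 2 == 0

def f(x):
    """Stand-in for the polynomial-time reduction from A to L.
    Here A is "odd number of 1s"; appending a 1 flips the parity."""
    return x + '1'

def in_A(x):
    """Decide A exactly as in the proof: test whether f(x) is in L."""
    return in_L(f(x))

assert in_A('100')        # one 1: in A
assert not in_A('11')     # two 1s: not in A
```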

Do there exist NP-complete sets? Here is an example:

L = {(<M>,x,1^{k}) | M is a nondeterministic machine
and M(x) accepts in k steps}
Here 1^{k} is just a string consisting of exactly k 1's.
L is in NP by just simulating M(x) for k steps. If A is in NP, then A must be
accepted by some nondeterministic machine M using time p(|x|) for some
polynomial p. The reduction f is just f(x)=(<M>,x,1^{p(|x|)}).

L is not a natural language; you are unlikely to see it come up
in any real-world application. In future lessons we will see that a
large number of truly natural search problems are NP-complete which is
why NP-completeness is perhaps the single most important concept to
come out of theoretical computer science.

2:32 PM
#
0 comments

### Wednesday, December 18, 2002

**Ramsey Theory and Computer Science**

Posted by Lance

Today we have a guest post from William Gasarch.
How many papers apply Ramsey Theory to Computer Science?
If you said 37 and 4 surveys then you've probably visited
www.cs.umd.edu/~gasarch/ramsey/ramsey.html
where William Gasarch has a collection
of such papers. A prominent theorist thinks there are
over 100. Rather than argue the point,
see if your favorite paper that applies Ramsey Theory
is there, and if not then email the reference
and if possible the paper or a link to it, to
gasarch@cs.umd.edu.

6:15 PM
#
0 comments

### Tuesday, December 17, 2002

**The Great Journal Debate**

Posted by Lance

Elsevier is closing down IDEAL, the electronic access
point for Academic Press, a publisher recently acquired by Elsevier.
This leaves only Elsevier's Science
Direct for electronic access of the Academic Press and other
Elsevier journals. Given this news and today's New York Times article
I feel I should comment on the great
journal debate. As a member of the editorial board of Information and Computation,
one of the Academic Press journals, these issues give me some angst.
The internet has, of course, had a large effect on the distribution of
scientific papers over the last ten years. Even more so, the consolidation
of the scientific publishing companies has put a squeeze on university
libraries.

Many of my colleagues have suggested that we just start up our own
online journals. Running a journal is more than just getting papers
refereed and sticking them on a web page. Journals have to be
marketed, maintained and presented in a format that makes information
easy to find. The private companies do a very good job of
this. However, Elsevier's recent pricing policies are causing many
libraries to drop several of their journals. Loss of access is never a
good thing.

The
professional societies, such as ACM, IEEE and SIAM have their own journals with their
own online access policies that might form a reasonable middle ground. You
can also often get early versions of papers from scientists' home
pages or sites like citeseer.

I have mixed emotions on the whole journal issue. Clearly status quo
is not working--something will have to give. My biggest fear is that
scientists will just stop submitting to journals altogether. I don't
believe this is the best way to maintain knowledge for generations to
come. After all, who will maintain your web page a century from now?

10:28 AM
#
0 comments

### Sunday, December 15, 2002

**Outsider Math**

Posted by Lance

The New York Times Magazine today highlights a year full of ideas, one of which is on the Agrawal et al. primality algorithm under the title "Outsider Math". While Manindra might
not have been an expert in number theory he was already an established member of the computational complexity community--certainly not an outsider in my mind.
8:38 AM
#
0 comments

### Friday, December 13, 2002

**Learning via Occam's Razor**

Posted by Lance

I have been asked to expand upon the spam example from yesterday's
Foundations of Complexity lesson. To do this let me go into a
little background on computational learning theory.
The basic model for learning theory has examples given as strings and
labelled positive or negative. From these labelled examples, one comes
up with a hypothesis that hopefully will classify future examples
well.

In PAC (Probably Approximately Correct) learning you want an
algorithm that takes labelled examples generated by some distribution
and outputs a hypothesis that, with high confidence, will
usually classify correctly future examples drawn from the same distribution.

One simple approach is the Occam algorithm based on the Occam's
Razor principle "when you have two competing theories which make
exactly the same predictions, the one that is simpler is the better."

The Occam algorithm works by finding the smallest representation of a
hypothesis consistent with the labelled examples and using that
hypothesis to predict future examples. There is a theorem in learning
theory that says this algorithm works well with the number of samples
roughly the size of the smallest representation.
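To make the Occam algorithm concrete, here is a toy sketch that brute-forces the smallest conjunction of literals consistent with labelled boolean examples. This is a far simpler hypothesis class than the circuits discussed below, and the encoding is mine, but the "smallest consistent hypothesis" principle is the same:

```python
from itertools import combinations, product

def occam_conjunction(examples):
    """Return a smallest conjunction of literals consistent with the
    labelled examples.  Each example is (features, label) where features
    is a tuple of bits.  Brute force, smallest hypotheses first."""
    n = len(examples[0][0])
    # A hypothesis {i: b} means "accept x iff x[i] == b for every i".
    candidates = sorted(
        (dict(zip(idx, bits))
         for r in range(n + 1)
         for idx in combinations(range(n), r)
         for bits in product([0, 1], repeat=r)),
        key=len)
    for h in candidates:
        if all(all(x[i] == b for i, b in h.items()) == label
               for x, label in examples):
            return h
    return None  # no conjunction fits the data

# One positive and two negative examples over three boolean features.
data = [((1, 1, 0), True), ((1, 0, 1), False), ((0, 1, 1), False)]
assert occam_conjunction(data) == {2: 0}   # "third feature is 0"
```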

Let us focus on the problem of identifying spam using general circuits,
a collection of AND, OR and NOT gates that capture computation. We'll
talk more about circuits in a future lesson but just think of the
running time of an algorithm as roughly the size of the equivalent
circuit.

Given a collection of emails each labelled either as SPAM or NOT SPAM,
the Occam algorithm requires us to find the smallest circuit that
correctly labels all of these emails. In general finding such a
circuit could be hard but under the assumption that P=NP this is an
easy task. By the result mentioned above this circuit will likely
classify SPAM correctly most of the time in the future.

Some caveats about this method.

- Why should there be a small circuit that characterizes spam? Well,
I can look at an email and determine whether it is spam and my brain
is just not that complex.
- The theorem only holds if the distribution we learn on is the same
as the distribution we apply our circuit to. Spammers might change
their spam to fool our new circuit. The P=NP assumption will make it
easier for them as well.
- The P=NP assumption is probably not true.

For more information and details on computational learning theory check out the
web site learningtheory.org
and the resources mentioned on that site.
3:07 PM
#
0 comments

### Thursday, December 12, 2002

**Foundations of Complexity**

Lesson 10: The P versus NP Problem

Posted by Lance

Previous Lesson |
Next Lesson
In Lesson
8 we looked at the class P, the set of efficiently computable
languages. In Lesson
9 we studied the class NP, the set of languages with efficiently
verifiable proofs.
Does P=NP, i.e., is every language with efficiently verifiable
proofs computable efficiently?

The P versus NP question is the most important question in
all of theoretical computer science and in mathematics in general. The
Clay Mathematics Institute lists the P versus NP question as one of
their seven Millennium
Prize Problems. Determine whether P=NP and collect a million
dollars.

To understand the importance of the P versus NP question, let us imagine a
world where P = NP.

- A large class of interesting search problems in NP, thought to be
hard to solve, would have efficient solutions. These include
Factoring, Map Coloring, Traveling Salesman, Job
Scheduling and thousands of others. Half of Garey
and Johnson is just a listing of NP-complete problems.
- Public key cryptography would be impossible.
- Learning via Occam's razor would be easy. For example if you
wanted an algorithm for separating spam from useful email, just search
for the smallest circuit that correctly identifies a large set of
samples. You can do this if P=NP.

This is just the beginning. Of course the general consensus is that
P≠NP but we have no idea on how to prove it.
Many of the future lessons will deal directly and indirectly with the
P versus NP problem. With these lessons, I hope to give you a real
feel for the importance and difficulty of this most important
question.

4:13 PM
#
0 comments

### Wednesday, December 11, 2002

**FYI: The AIP Bulletin of Science Policy News**

Posted by Lance

The American Institute of Physics produces a free electronic newsletter, FYI, covering science policy in the US.
Although it has a physics bent, FYI does an amazing job educating the scientific community on how US science policy is set and what the current issues are. For example, check out this bulletin on the new NSF authorization bill.
Anyone interested or involved in science funding should subscribe to this newsletter. There is also a
FYI This Month that gives a single monthly email highlighting the important FYI bulletins.

7:48 PM
#
0 comments

### Tuesday, December 10, 2002

**CAPTCHA in the Times**

Posted by Lance

Today's New York Times carries a lengthy article about how Yahoo is using Manuel Blum's CAPTCHA project to prevent automated registrations. You
saw it here first.
5:49 AM
#
0 comments

### Monday, December 09, 2002

**SIGACT News**

Posted by Lance

The December SIGACT News is out, the first edited by David Haglin. Of interest to complexity theorists
- Lane Hemaspaandra's Complexity Theory Column has Part 2 of the Schaefer-Umans survey on Completeness in the Polynomial-Time Hierarchy.
- A book review of
*A New Kind of Science* by Stephen Wolfram. "It's not new, it's not science and it's not kind."
- The Education Forum discusses a proposal for a Dagstuhl-like facility in the US and a preview of the Homer-Selman book
*Computability and Complexity Theory*.
- Calls for nominations for the Gödel Prize and Knuth Prize. Nominate your
favorite complexity papers and theorists.

3:37 PM
#
0 comments

**Complexity Class of the Week: MA**

Posted by Lance

Previous CCW
MA gets its name from the Arthur-Merlin games developed by Babai. MA
is an interactive proof system where the all-powerful Merlin sends a
message that is verified by a probabilistic Arthur. Here is a formal
definition.

A language L is in MA if there is a probabilistic polynomial-time
machine M and a polynomial p such that for all strings x,

- If x is in L then there is a w, |w|=p(|x|) and M(x,w) accepts with
probability at least 2/3.
- If x is not in L then for all w, |w|=p(|x|), M(x,w) accepts with
probability at most 1/3.

As with many other interactive proof classes, the 1/3 can be replaced with a
value exponentially small in |x| and the 2/3 can be replaced by 1.
The w is a proof that Merlin can write down and Arthur can verify
years later without further interaction from Merlin. Sometimes MA
is called the class of languages with publishable proofs.
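The amplification of 2/3 versus 1/3 to near-certainty works by running Arthur's check several times and taking a majority vote. A quick simulation (with made-up acceptance probabilities standing in for a proof's effect on Arthur) illustrates the gap widening:

```python
import random

def arthur(p):
    """One run of Arthur's probabilistic check, accepting with prob. p."""
    return random.random() < p

def amplified(p, k):
    """Accept iff a majority of k independent runs accept."""
    return sum(arthur(p) for _ in range(k)) > k / 2

random.seed(0)
trials = 2000
# Yes-instances (accept prob. 2/3) almost always survive a 51-run
# majority vote; no-instances (accept prob. 1/3) almost never do.
yes = sum(amplified(2/3, 51) for _ in range(trials)) / trials
no = sum(amplified(1/3, 51) for _ in range(trials)) / trials
assert yes > 0.95 and no < 0.05
```

By a Chernoff bound the error probability drops exponentially in the number of repetitions, which is where the "exponentially small in |x|" figure comes from.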

Despite the naturalness of the class, there are no known natural
problems in MA not known to be in NP∪BPP.

MA contains NP and BPP and also NP^{BPP}. MA is
contained in its cousin class AM and thus BPP^{NP}. MA is also
contained in many of the same classes BPP is contained in, including
S_{2}, ZPP^{NP},
Σ_{2}∩Π_{2} and PP. There are oracles
where AM is not contained in these classes. Also none of these
containments are tight in all relativized worlds.

If L is checkable and has polynomial-size circuits then L is in
MA. I will leave the definition of checkable to another post but this
implies

If EXP is in P/poly then EXP is in MA.
We can replace EXP in the above statement with PSPACE or PP. Using the
above statement one can show that MA_{EXP} does not have polynomial-size circuits.
MA_{EXP} is like MA with polynomials replaced with exponentials.
A strong pseudorandom generator that derandomizes probabilistic
circuits will also derandomize MA to NP. Using the result of
Impagliazzo-Wigderson we get: If E requires 2^{ε n}
size circuits for some ε>0 then MA = NP.

The quantum version of MA, QMA, has gotten some recent attention. Here
Merlin sends entangled quantum bits and Arthur does quantum
transformations and measurements. We will discuss QMA another day when
it has its turn as complexity class of the week.

10:56 AM
#
0 comments

### Friday, December 06, 2002

**Bad Math Joke**

Posted by Lance

There are 10 kinds of mathematicians: Those who understand binary notation and those that don't.
2:25 PM
#
0 comments

### Thursday, December 05, 2002

**Foundations of Complexity**

Lesson 9: Nondeterminism

Posted by Lance

Previous Lesson | Next Lesson
To properly classify problems we will sometimes need to define models
of computation that do not exist in nature. Such is the case with
nondeterminism, the topic of this lesson.

A *nondeterministic Turing machine* can make guesses and then
verify whether that guess is correct. Let us consider the map coloring
problem. Suppose you are presented with a map of a fictitious world
and you want to give each country a color so no two bordering
countries have the same color. How many colors do you need?

The famous Four
Color Theorem states that four colors always suffice. Can one do
it in three?

Let L be the set of maps that are 3-colorable. A nondeterministic
Turing machine can "guess" the coloring and then verify quickly for
every bordering pair of countries that they have different colors.

We let NP be the class of problems solvable in nondeterministic
polynomial time. We can give an equivalent definition of NP using
quantifiers: L is in NP if there is a polynomial p and a deterministic
Turing machine M such that for all
x, x is in L if and only if there is a w, |w| ≤ p(|x|) and M(x,w)
accepts in time at most p(|x|).

The string w is called a witness. In the case of map coloring, w is
the coloring of the countries.
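The verification step is easy to make concrete. Here is a minimal sketch of the machine M(x,w) for map coloring, where the map is a list of bordering pairs and the witness is a color per country (the encoding is mine):

```python
def verify_coloring(borders, coloring):
    """The machine M(x, w): check that the witness coloring uses at most
    three colors and gives bordering countries different colors.  Runs in
    time linear in the number of borders."""
    return (len(set(coloring.values())) <= 3 and
            all(coloring[a] != coloring[b] for a, b in borders))

# A tiny map: country A borders the other three, and B borders C.
borders = [('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'C')]
witness = {'A': 'red', 'B': 'green', 'C': 'blue', 'D': 'green'}
assert verify_coloring(borders, witness)
assert not verify_coloring(borders, {c: 'red' for c in 'ABCD'})
```

The hard part, of course, is finding the witness, not checking it.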

Can we solve map coloring in deterministic polynomial-time? Is every
NP problem computable in deterministic polynomial-time? Therein lies
the most important question of all, which we will discuss in the next lesson.

6:10 AM
#
0 comments

### Wednesday, December 04, 2002

**You can't judge a proceedings from its cover**

Posted by Lance

Scott Aaronson writes an ode to the FOCS proceedings cover illustration and asks about the origin of the illustration. All I can say is that the FOCS cover has remained unchanged since at least my first FOCS conference in 1986.
I can give the history of the Complexity Conference cover which should be similar since Complexity and FOCS are both IEEE Computer Society conferences.
When Complexity, then known as Structure in Complexity Theory, took on IEEE sponsorship in 1987, an IEEE artist took a look at the papers and came up with the cover with various symbols from the text. Here is a scan of the 1990 cover.

When Eric Allender took over the conference he rightly eliminated the strange symbols from the cover and in 1998 what is now the current cover first appeared.

If you are a budding artist and wish to redesign the complexity proceedings cover, we always welcome new possibilities. If your cover is approved by the conference committee, you will be well acknowledged in the proceedings and bask in the glory of knowing your artwork will live on in the offices of complexity theorists around the world for generations to come.

7:42 AM
#
0 comments

### Tuesday, December 03, 2002

**Quantum Information Science and Technology Roadmap**

Posted by Lance

The Quantum Information Science and Technology Roadmap version 1.0 is now publicly available. From the web site:
*The overall purpose of this roadmap is to help facilitate the progress of quantum computation research towards the quantum computer science era. It is a living document that will be updated annually.*
Section 6.8 gives a nice overview of the theoretical computer science contributions to quantum computing.

1:07 PM
#
0 comments