# Ten Tips for Writing CS Papers, Part 2

Thu 10 December 2015 -

This continues the first part on tips to write computer science papers.

## 6. Ideal Structure of a Paragraph

A paper has different levels of formal structure: sections, subsections, paragraphs, sentences. It is important to ensure that the structure of the content aligns well with the formal structure because the formal structure is readily perceived by the reader, whereas the structure of the content is not. With a good alignment we make it easy for the reader to have the right mental model for the organization of the content; this enables a better navigation and memory of the content.

An important consequence of a well organized paper is to minimize the possible surprise for the reader. In general you may want to surprise readers with how amazing your method or achievements are, but not through the organization of the paper.

How to align the content with the formal structure? There is more to say about this and I recommend the references at the end of this article, but here I want to focus on the structure of one or multiple paragraphs. The basic rules are:

1. One paragraph should contain only a single idea or a single point of argumentation.
2. The beginning and the end of a paragraph glue the paragraph into the surrounding content.

There is an ambiguity as to what constitutes a separate idea and indeed paragraphs may be of quite different lengths.

To achieve a good structure, here is a recipe that works for me. For a section I would like to write I make a list of bullet points of things I want to say, with one bullet point being a single idea or important point. Each point may have one or more dependencies on other points and I use the dependencies to order the list. Finally, I write one paragraph for each item on the list and I may add an additional paragraph at the beginning and end of the section to connect the section to the surrounding content.

I found that this recipe also makes my job as a writer easier because it overcomes my writing inhibition in two ways. First, I can start by simply making a list and this does not feel like writing. Second, once the ordering of ideas is clear, the actual writing becomes a lot simpler.

Here is an example of a less-than-ideal paragraph from Section 2.3 in (Gehler and Nowozin, 2008).

"As already mentioned to our knowledge (Argyriou et al., 2006) were the first to note the possibility of an infinite set of base kernels and they also stated the subproblem (Problem 1). We will defer the discussion of the subproblem to the next section and shortly comment on the differences of the Algorithm of (Argyriou et al., 2006) and the IKL Algorithm. We denote with $$g$$ the objective value of a standard SVM classifier with loss function $$L$$."

Let us reverse engineer the content of this paragraph, then restructure it. The paragraph makes two points: first, a connection to the work of (Argyriou et al., 2006). Second, it establishes some notation. So it should perhaps be split into two paragraphs.

For the first point, the beginning is also less than ideal: "as already mentioned to our knowledge"; it is a bit redundant and apologetic to point out that we already mentioned it and that we may not know better. The second point, the notation, is okay by itself, but it is unclear why it follows the first: is it done in order to enable the comparison between approaches? We would need to read ahead to find out. (This is indeed the case.) Here is a proposed improvement:

(Argyriou et al., 2006) first recognized the possibility of an infinite set of base kernels and we now discuss the connection to our work.

To make the connection explicit we first establish the notation we will use throughout the paper. We use $$g$$ to denote the objective value of a standard SVM classifier, where $$L$$ is the loss function.

It is simpler to read and makes it clear why we introduce the notation. Also note the end and beginning of the two short paragraphs: the end of the first paragraph tells you what comes next ("the connection to our work"), the beginning of the second paragraph tells you how this is done (through notation). The flow between the two paragraphs is natural now and they could almost be merged into one again with the single point of the resulting paragraph being "the connection between (Argyriou et al.) and our work".

## 7. Avoid Ambiguous Relative Pronouns (This, These, That, Which)

When used properly, a relative pronoun, such as "this", "these", "that", "which", can effectively refer to a previously mentioned noun, and that has to be remembered by the reader.

In the previous sentence, which entity did "that" refer to? Is it "a previously mentioned noun"? Or is it "a relative pronoun"? Or is it the proper use?

Ambiguities of relative pronouns are common because the writer does not experience the ambiguity. After all, it is clear to the writer what he refers to. Train yourself to recognize any potentially ambiguous relative pronoun, ideally by using a highlighter to mark them in a printout.

To resolve the ambiguity the easiest solution is simply to add the noun it refers to. For the above example, "that" would become "that noun".

(Another issue I ran into frequently is in deciding between "which" in cases where "that" should have been used, such as in "We use an algorithm which is efficient." I remember annoying a former American colleague of mine by using "which" a bit too often. Some advice is available.)

Here is a real example from an ICDM 2008 paper of mine. I highlight all relative pronouns.

Extracting such geometric patterns from molecular 3D structures is one of the central topic in computational biology, and numerous approaches have been proposed. Most of them are optimization methods, which detect one pattern at a time by minimizing a loss function (e.g., [14, 15, 6]). They are different from our approach enumerating all patterns satisfying a certain geometric criterion. In particular, they do not have a minimum support constraint. Instead they try to find a motif that matches all graphs.

This is not the worst example but can be improved nevertheless. The first "which" is best removed, the other relative pronouns are best clarified. Here is a proposed improvement:

Extracting such geometric patterns from molecular 3D structures is one of the central topic in computational biology, and numerous approaches have been proposed. Most of them are optimization methods, detecting one pattern at a time by minimizing a loss function (e.g., [14, 15, 6]). These optimization methods are different from our approach enumerating all patterns satisfying a certain geometric criterion. In particular, other methods do not have a minimum support constraint and instead try to find a motif that matches all graphs.

## 8. Provide Continuation Markers

Continuation markers are sentences or paragraphs, typically at the beginning of sections, to tell the reader what will be presented next and to tell the reader how it is relevant or how it relates to what has been presented already. It provides structure and flow, connecting the different parts of the paper.

Here is an example, from an ICCV 2015 paper:

"3. Method

We now describe our model for tracking fast moving objects. While the motion model is standard, the observation model for raw ToF captures is a novel contribution."

Note two elements here: first, there is an explicit statement of what will be presented next (the model for tracking fast moving objects). Second, we establish relevance with respect to the contribution.

There are two reasons why thinking about natural continuation markers for reading the paper is important. First, it enables navigation through the paper by allowing the reader to skip sections more efficiently. Second, without the necessary background it may take a reader multiple repeated readings to fully understand the paper. If you lost the reader, providing a natural re-entry point makes it easier to continue reading the paper despite a lack of understanding of some parts.

Both reasons are especially important for reviewers, a special type of reader. Ideally the reviewer is an expert in the field already, so we would like to make it easy for him to quickly navigate to relevant parts of the paper. Less ideally, the reviewer is working under time pressure or without keen interest in the work; in this case we would like to minimize misunderstanding or missing important points during reading.

It is important to co-locate the continuation markers with the actual text itself. It is not sufficient to provide a mini table-of-contents as part of the introduction ("In Section 2 we present related work. In Section 3 we present our method. etc.").

## 9. Multiple Authors

It is a reality that most computer science papers are authored by multiple authors. Coordinating the writing between multiple authors can be challenging on both the level of content and in terms of technology.

In terms of content, in my experience a recipe for disaster is to divide the paper into parts and agree that "Author A will write the introduction, author B will write the method, etcetera". The resulting draft will be incoherent and everyone has an excuse for delaying their part due to perceived dependencies ("I will write the method once the notation is defined in the introduction", "I will write the introduction when we have results").

Also, when dividing up work this way the draft can be poorly balanced in terms of relevant parts, as sub-authors tend to be assigned to the parts they have contributed to the most, which provides an incentive to describe their own contribution in too much detail (for example senior authors writing the introduction will fill it discussing their past research agenda that led to this work; the author writing about the implementation will want to go into detail because it was really difficult to get it to work and people may miss just how difficult it was, etcetera).

It is better to assign responsibility to a single author to write a full draft, then iterate together over this draft. There are two reasons why it is better: first, clear responsibility gets stuff done; second, the draft will be more coherent with a more linear flow of arguments.

The single author draft works best if the draft writer is an experienced author because iterating on a poorly organized draft may take more effort than a complete rewrite. When iterating on a draft it is important to distinguish substantial from minor changes. Minor changes are changes that fix issues locally, such as adding a sentence for clarification, changes of word order, typos, etc. These changes are important but not urgent. Most accomplished authors I know prefer to make these changes in passes through the full paper, much like polishing the paper with each reading.

Substantial changes are things like addition or removal of sections, changing the order of the presentation, enlarging or shrinking the claimed contribution, etcetera. Such changes can have large implications on the other parts of the paper which need to be addressed and therefore such changes are important and urgent because they require less time if made early.

In terms of technology, I frequently experienced problems due to the diversity of authors and their working style. Often some authors will be senior authors with a proven but dated work setup, for example, not using basic version control systems and being stuck in an unflexible editor that mangles LaTeX every time it opens a file. To be fair, these authors are often most essential in terms of providing feedback on the content of the paper and they may have little time available to stay up to date with the latest tools. For addressing this problem with technology, my recommendations are the following:

1. Use a version control system: this should almost go without saying and even if you are the sole author of a paper it is best to use a version control system because it provides a simple method to back your work up. But for multiple authors coordinating the writing of a paper without a version control system is simply a waste of time and nerves of everyone involved.
2. Use a friendly version control system that provides a simple web interface; Bitbucket is my favorite for paper writing because it offers free private git repositories and allows you to view changes in a neat timeline in the browser. While hardly surprising to any git user, this feature is readily appreciated by everyone. Also, for minor changes Bitbucket actually allows editing from within the browser.
3. For yourself: when writing LaTeX write one sentence in a line and use a line break after each sentence. This makes merging conflicts easier and leads to fewer surprises with strange editors breaking long lines. (I also found that this helps me to improve the organization of a paragraph because every sentence now starts at the beginning of a line.)
4. When you need only high level feedback from your coauthors, sending them a PDF for annotation via email may still be the most efficient way.

## 10. Authorship and Author Ordering

Except for the writing itself, another common problem with multiple authors is discussions about authorship and author ordering. While not related to writing papers per se, I do want to share some remarks on this topic. There are only a few common situations where debates about author ordering arise. Here are a few common examples, with the more common cases first:

1. A small contributor or someone involved in early discussions wants to be a co-author, but other authors disagree based on the amount of time they contributed.
2. There is a PhD student, a post-doc, and a faculty author and in most computer science venues the recognition is strongest for the first and last author position. The post-doc feels he guided the student the most so deserves to be recognized, but the faculty member may feel different based on seniority or being the source of funding.
3. Two or more students contributed to a piece of work and see their contribution as the strongest; this happens sometimes when a student postpones a line of work and another student is continuing with the work, directed by a joint supervisor.
4. Two or more senior authors feel that they started or guided the project the most.

Obviously there is no "right way" to handle all circumstances, and indeed computer science handles authorship differently to, say, mathematics, for example. Of course everyone agrees that scientific authorship should imply substantial contributions to the work, but that is about as ambiguous a statement as can be made. To be more concrete, here are some observations.

First, some conflicts can be anticipated, for example the case of two students. Here, it is best to discuss a possible publication and authorship as soon as the second student gets involved. This discussion should be summarized via email for future reference. Likewise for the case of the small contributor, as soon as it is clear the work will end up in a publication a discussion should help to set expectations, for example to offer authorship only if additional work is invested.

Second, as a young PhD student one naturally underestimates the implicit future benefits that arise from co-authorship. For example the senior co-authors may present the work at venues otherwise inaccessible, or the work will lead to substantial future collaborations with the original co-authors.

Third, when considering whether to include a small contributor as co-author, the problem is most often not the co-authorship itself, but possible future actions by the contributor after the paper is published (for example, giving seminar talks about the paper). The other authors may then feel that the credit and opportunities are taken away from them. By discussing not just the co-authorship itself early but instead also what future paper-related actions are done by whom these problems can be avoided. For example, all authors may agree that seminar and job talks about the work should only be presented by the lead author.