Stories of how my research papers were published

Behind the scenes: the struggle for each paper

By Jeff Huang, published 2021-06-14, updated 2021-08-11

The start of my sabbatical has given me a moment to reflect on my publications. But my CV only shows a list of neatly cataloged papers: title, authors, conference. Each one appearing no different from another. But how each paper ended up published is its own story, a story about people and opportunity.

Using notes from my research journal and conference records, I reassembled the "Behind the scenes" story for each of my full research papers: 15 papers as a student, and 15 papers after becoming a professor (excluding papers not led by my research group, as I feel it's not my place to tell those stories). This is as much a reflective exercise for myself as it is for an audience.

If you read this page in its entirety, it will take about 30 minutes. But you can skip any story, and it should still make sense. The first half are my student stories, and the second half are my professor stories, so you can even just read the half that's more interesting to you. This would probably be more enticing as a series of Tweets or a Substack newsletter, but I'd rather post it all at once.

I'd love to read the backstories of other people's publications too. So if you feel comfortable sharing, post yours and email me if you want it linked at the bottom of this page so we can start a collection. Anyways, here goes.

Part 1: Papers as a student author

I first stumbled upon research in September 2003 while I was a second-year undergraduate student. My inbox popped up the message, "call me or track me down in class, 'cause a full explanation will take a lot of typing. -Bo".

Bo was a friend who needed some programming help for an automated help generator as part of a research project, and off I went. What I thought would be a one-week task turned into three rewrites of an application, two studies, and one long faculty mentorship. Neither of us could predict that Bo would graduate before the main study began, so he did not even become a co-author of a project he started (though in hindsight, I could have just put him as second author).

Because I was learning as I went, I made all the novice mistakes from study design to paper structure. It was a long 3.5 years between when I got that first message and when it was published, after my graduation. And the acceptance decision came only after eking out a lucky coin-flip, as the initial metareview score was "3-Borderline".

But this paper was critical to getting me accepted to a Ph.D. program. Why do I think that? Well I was rejected by every Ph.D. program I applied to before this publication (but that's another story). So with this paper published, I left my job, squeezed my belongings into my car, and drove up the I-5 highway from California to Seattle.

Graphstract: Minimal Graphical Help for Computers. Jeff Huang, Michael Twidale. UIST 2007.

My next paper was sadly my first and only full paper with my original advisor, Efthi, before he passed. I don't remember coming up with this idea to study query reformulations, so it must have been he who led me to it. It was intended to be something easy but original—a starter project analyzing the sequences in existing data released by AOL.

It was rejected from SIGIR on my first try, and I was starting to worry the topic would become stale, especially with the controversy about the original dataset that led to AOL closing their research department. So I was relieved this wrapped up quickly, and it felt good to have a paper under my belt in my first year.

It puzzles me that this is my most cited full paper, but I think it's due to its topic rather than because of its contribution; though when I checked recently, a few citations are from people using the source code even 12 years later. I guess there's something to be said about the compound interest for citations.

Analyzing and Evaluating Query Reformulation Strategies in Web Search Logs. Jeff Huang, Efthimis Efthimiadis. CIKM 2009.

I met Ryen from Microsoft Research when he was a visiting speaker in Efthi's class, and he must have found my internship application that mentioned my paper about query reformulations. Well it turned out he was interested in search trails, which are essentially a series of reformulations, so I was able to continue along that line of work.

The gold standard for these internships was to do a paper from start to finish in the twelve weeks, and I was desperately trying to do paper-worthy work to prove myself. I carried out the analysis as best I could; however, I struggled to do enough for a full paper by the end of the internship. Fortunately, Ryen finished up and expanded on it substantially after I left, so I'm grateful he didn't just give it up as unfinished.

To my complete surprise, this paper won the best paper award that year at SIGIR. Even after being nominated for the award, I felt it was so unlikely to win that I didn't attend the conference. In fact, at that time I thought my other related paper at that conference (the one below) was a better paper overall, but now I see that the best paper award committee probably felt that evaluation was a messier topic so wanted to reward that effort. This award boosted my confidence during my Ph.D. and opened some doors later, so I'm both lucky and thankful it happened.

Assessing the Scenic Route: Measuring the Value of Search Trails in Web Logs. Ryen White, Jeff Huang. SIGIR 2010.

This related paper came out of the same summer internship with Ryen. I was lucky to be included in the author list because I had a small role and never even met the first author. Besides my initial work over the summer that was more so for the paper above this, I contributed only a paragraph of original writing. After reading this paper, I felt like it was better than the one I had contributed more to, but it has received fewer than half the citations.

Studying Trailfinding Algorithms for Enhanced Web Search. Adish Singla, Ryen White, Jeff Huang. SIGIR 2010.

I met a brilliant intern during that same summer, Anna, who was doing her first of two Math PhDs at that time and often seemed idle. She explained, "the people who are always busy never seem to get much done." So sensing her free time, I asked her to help solve a made-up hypothetical problem that simplified assumptions from the information retrieval community.

We had a good time working out the math, mostly me asking questions while watching her think aloud on the whiteboards. I learned about different strategies for deriving proofs in a real setting, which were unlike problem sets, where you knew a proof existed and the problems were roughly the right difficulty for you to do. During this summer, I was mathematically in my best shape, as I could follow along enough to write out the solution and check for errors, whereas today it would take a while to even familiarize myself with the equations in the paper.

Optimal Strategies for Reviewing Search Results. Jeff Huang, Anna Kazeykina. AAAI 2010.

After the summer, I was looking for new ideas to pursue outside my usual information retrieval topics. Gifford, a friend in the Ph.D. program, came to me with the suggestion that watching people playing video games was an overlooked phenomenon. But no one was paying attention. At that time, you could really only watch low-resolution videos of Korean players with amateur commentary dubbed in English.

That idea clicked with me, so I helped out with a qualitative study to understand why. Gifford taught me most of what I know about grounded theory (and gave me a second lesson a few years ago while collaborating on a more recent 2018 paper).

At that time, I had in my mind that I was helping him rescue a rejected idea, but now in retrospect, it's clear that it was he who gave me the opportunity to help out on work he already had a vision for. I'm quite proud of this work, which is my third most-cited paper, but mainly because what we thought would become a phenomenon did indeed come to pass.

Starcraft from the Stands: Understanding the Game Spectator. Gifford Cheung, Jeff Huang. CHI 2011.

This became the most important paper during my Ph.D., and it was minutes from not happening. The idea was one of three that I tossed around during my interview to be a Research Intern at Bing. But to apply this idea to the search engine required a complex series of software integration steps that I was not too familiar with; yet it had to be committed by deadlines that occurred every two weeks.

I aimed for a commit deadline during the second half of my internship, and the day it had to ship by 5pm, my code simply wouldn't pass the unit tests. I tried to force it by (naively) changing the unit tests, but that just broke other parts of the build process. It was 4pm and I even had to be somewhere else across the bridge in 30 minutes; I was desperate. I ran in the hallways panicking and found a software developer, Sarvesh, who took a look and gave me some tips. But by the time I had to leave, I still could not push the code. Sarvesh again came to my rescue and assured me, "you take off, I've got this" and sat down at my desk to fix the problem and ship my code while I drove out of the parking lot. Without his help, I would have had to ship it during the next cycle, and would not have left enough time to analyze the data before my internship was over. Not only that, but because it was during a corporate internship, I would have no rights to the intellectual property of any of the work.

But my 20 lines of JavaScript did ship and luckily my code was bug-free (after I pored over every line a hundred times), so this paper became the first paper I published at the main conference in my field; it was nominated for a best paper award, and became the foundational chapter in my dissertation. Not only that, but it was this work that led to a Google Research Grant, Facebook Ph.D. Fellowship, and Microsoft patent.

No Clicks, No Problem: Using Cursor Movements to Understand and Improve Search. Jeff Huang, Ryen White, Susan Dumais. CHI 2011.

Back at school, and with momentum from our previous paper about gaming, Gifford and I looked at one of his rejected papers about Texas Hold'em, and expanded it with data from another game, Halo 2. We repeated our formula of qualitative analysis, and submitted to CSCW. It received borderline ratings from our reviewers, but CSCW was experimenting with a "revise and resubmit" process that we were lucky to go through.

Remix and Play: Lessons from Rule Variants in Texas Hold'em and Halo 2. Gifford Cheung, Jeff Huang. CSCW 2012.

At this point, one of my friends from the summer internship, Abdi, felt like I had "the magic touch" so I offered to help out with a rejected internship paper. I only helped brainstorm and edit, but it did end up getting accepted without much fanfare. While this one worked out, a year later I tried to rescue another paper with him, but it only ended up in the bin. Same effort, opposite outcomes.

Interactive Search Support for Difficult Web Queries. Abdigani Diriye, Giridhar Kumaran, Jeff Huang. ECIR 2012.

Now it was the end of my third year, and I felt guilty that I only had one dissertation chapter ready. I had co-founded a startup as part of Techstars Seattle while teaching a class in the evening, so those things had occupied all of my time. Our startup eventually ran out of money and my co-founders left for other opportunities, so in the summer I returned to Microsoft Research for my third internship. Again I joined a different group, this one managed by Sue, who a year later became my Ph.D. advisor.

Her group was probably the closest to my core research interests, and it just happened that Ryen had moved to this group. So with this combination of good circumstances, I aimed to write three papers with Ryen and others to get enough material for my dissertation.

This first one was difficult for me, using a technique I was unfamiliar with, and required a lot of compute. My compute jobs would compete with higher priority jobs during the day so often timed out. So I had to stay many late evenings to kick off the 8-hour distributed processes that could only finish at night when the cluster was not in heavy use. My efforts paid off; reviewers were generally favorable and it was a clean accept.

Now I have to confess, this is the paper I am most skeptical about. I don't fully trust the model, even though I kept checking it over. The results showed a modest improvement, but I can't seem to shake the feeling that they were due to a secondary factor like a collinearity, or even worse that there was a calculation error somewhere in there. But now my code is probably gone, and it seems like others were able to show practical improvements using similar models.

Improving Searcher Models Using Mouse Cursor Activity. Jeff Huang, Ryen White, Georg Buscher, Kuansan Wang. SIGIR 2012.

A Bing employee, Georg, had collected a nice dataset from a study for one purpose, but it seemed like it could be used to study something closer to what I had done before. He graciously lent me the data, and I did enough analysis to produce this paper. However, one reviewer felt that the descriptive results were not that novel (just a bigger study), nor was the predictive analysis that successful, summarizing "with their paper they now try to add more to this field, but I don't see the important contribution that would justify the publication at CHI." The opinions were ultimately divided.

I sweated over the rebuttal and promised changes, and luckily convinced the metareviewers to let this one slip through. But it actually turned out to be a fairly influential paper, with 223 citations as of today, and served as a baseline for some of my students' work later. So this was paper number two from that summer. I think it hit a trend of papers about user attention that came out the next few years, but this trend has dwindled since then.

User See, User Point: Gaze and Cursor Alignment in Web Search. Jeff Huang, Ryen White, Georg Buscher. CHI 2012.

The third paper of that summer was an easier analysis with a novel idea: that sophisticated use of browser tabs was a central part of finding information. I recruited another student from my university, Tom, who happened to be there interning the same summer. The metareviewer remarked quite correctly, "The paper is not rocket science but [...] to my knowledge has not really been looked at, at least not at this large scale."

While browser tabs did continue to be a phenomenon, this paper got fewer citations than a lighter short paper I wrote earlier on this topic. I think partly because the title was too clever to be easily recognizable.

No Search Result Left Behind: Branching Behavior with Browser Tabs. Jeff Huang, Thomas Lin, Ryen White. WSDM 2012.

As a bonus for the summer, Georg was writing his own paper, and I had a minor contribution that he deemed enough for a co-authorship. It got a decent number of citations and filled a nice gap in the research field, as well as in my dissertation. So it became a lucky 4-paper summer. I was exhausted by the end of it and took a nice long break.

Large-Scale Analysis of Individual and Task Differences in Search Result Page Examination Strategies. Georg Buscher, Ryen White, Susan Dumais, Jeff Huang. WSDM 2012.

Besides the two papers with Gifford about gaming, I wasn't doing any research during the school year, as all my other papers were from internships. Without an advisor, I served as a teaching assistant every quarter (sometimes for two or even three classes at once), and got a bit distracted with some other activities.

But one day my opportunity to be a TA vanished (itself another story), and I begged Oren, a professor from the computer science department, for an office and funding. To my surprise, he agreed within hours of my email. So I started to learn about natural language processing and got to see how he ran his lab.

This resulting paper was a combination of his interests and mine. A couple of undergraduate students joined in under my supervision, which was also my first time mentoring students. The study almost did not happen, because I had trouble fitting the procedure under the rules of our human subjects guidance. But after last-minute discussions with Oren and an HCI professor, we found a way to thread the needle.

Its publication helped launch one of the undergraduate students into the Ph.D. program at MIT, and led to my interest in involving undergraduate students in research for many years to come.

RevMiner: An Extractive Interface for Navigating Reviews on a Smartphone. Jeff Huang, Oren Etzioni, Luke Zettlemoyer, Kevin Clark, Christian Lee. UIST 2012.

I wasn't originally planning to do another research internship at Microsoft in my last summer, but Tom and Nachi reached out to me about an opportunity I couldn't say no to—a summer to study gaming with large-scale Xbox data. So in my last summer, I joined their gaming research initiative. It was a collaboration between Microsoft Research and folks from Xbox, and along the way I learned a few techniques for time-series analysis.

I barely finished the final analysis in the paper the day that I had my farewell lunch. Reviewers loved the work more than I expected, and this paper led to a few opportunities later so I'm glad it worked out. It also brought closure to my Ph.D., as I ended up with no leftover working papers in the pipeline, hence the 2-year gap until my next paper.

Mastering the Art of War: How Patterns of Gameplay Influence Skill in Halo. Jeff Huang, Thomas Zimmermann, Nachiappan Nagappan, Charles Harrison, Bruce Phillips. CHI 2013.

Part 2: Papers as a faculty author

Up to this point, I was feeling comfortable leading a paper from start to finish, but the job changed when I became faculty. While I could come up with the idea and advise the process, the initial drafts would be written by students.

So fast forward past my move to Providence and a few false starts at unfinished projects. My first published paper as a faculty member came from a student referred by a professor at another university. The student was Eddie, an undergraduate student from UCLA who reached out to the professor about conducting research analyzing patterns in StarCraft replays. That professor thought I fit the topic better but cautioned, "he [I] probably doesn't have the bandwidth to supervise external students at this time". While that would probably be true now, back then I took the chance and steered him towards an adjacent investigation.

I invited Gifford (yes, my classmate from before) to help out, and the work from start to finish was about 8 months of intense analysis and figure-making. The paper ended up with two strong ratings (4.5/5, 5/5) and two unenthusiastic ratings (2.5/5, 3/5), so the compromise was that it ended up shepherded (a paper deemed borderline but asked to make specific changes to be acceptable) to guide us to "accept". This made us nervous for longer, but after this paper got in, I wrote to Eddie, "You've earned your golden ticket to grad school :-) congrats!" and he chose to do a Ph.D. at the University of Washington, my own alma mater.

Masters of Control: Behavioral Patterns of Simultaneous Unit Group Manipulation in StarCraft 2. Eddie Yan, Jeff Huang, Gifford Cheung. CHI 2015.

While teaching my graduate seminar, I was overwhelmed by the enthusiasm of the students so I started assigning projects that students could recycle into research. This became the first of a few papers that were born from class projects in my HCI seminars. Each student worked on their own mini-study, and we combined it into a meta analysis, which is a formula that worked for a couple more papers in later years as well.

The timing for this particular paper was a bit lucky because the reviewers nominated it for a best paper award, but our follow-up work was not as successful; we still had more to say on this topic, but met a lot of resistance in writing the sequels after years of trying to publish newer findings with only rejections.

Crowdsourcing from Scratch: A Pragmatic Experiment in Data Collection by Novice Requesters. Alexandra Papoutsaki, Hua Guo, Danae Metaxa-Kakavouli, Connor Gramazio, Jeff Rasley, Wenting Xie, Guan Wang, Jeff Huang. HCOMP 2015.

I led my first Ph.D. student Alexandra in a few unfruitful directions trying to continue ideas from my Ph.D. that ended up with two unpublished papers' worth of work (so I'm grateful for her patience). But one day while lying in bed I realized we could flip the story from estimating attention with the cursor, to using the webcam while the cursor auto-calibrates the webcam model during regular web browsing. We could deploy this as a library, basically shipping a product.

This paper was the first of many product-style papers that have become the norm in our research group. The work was initially rejected at multiple conferences because while the overall system was effective and the functionality was novel, the technique was not innovative and the results were numerically worse than some of our competitors. I was frustrated about being compared against competitors who only reported data from the users for whom they got good results (even when they are upfront about omitting results from most of their users), while we were reporting full results from every user.

Anyways, it took over two years to build and publish, but I'm proud that the system in our paper is used by a sizable community. It has become part of a popular psychology library used for many research studies, and adopted by a few startups including one which bought a non-exclusive commercial license. We knew this work would have impact later, as Alexandra sent me one of my favorite acceptance notifications, "It got in!!!! I am going back to sleep, I'll email the rest of the authors tomorrow! :D Very excited, Alexandra" This paper was the foundation for her dissertation, and we are both still working on the project now seven years since it began.

WebGazer: Scalable Webcam Eye Tracking Using User Interactions. Alexandra Papoutsaki, Patsorn Sangkloy, James Laskey, Nediyana Daskalova, Jeff Huang, James Hays. IJCAI 2016.

I admitted a second student, Nedi, to our Ph.D. program who had prior work on sleep diary research during her undergrad. My brainstorming notes in the months before she arrived read, "we will use data-driven techniques across a large populations sleep data to make (personalized) prescriptive sleep recommendations". By coincidence, I met some new collaborators who had clinical research expertise for this, so Nedi and her team of research assistants set out with these collaborators to develop the software and the study.

We had a tough start and ate a few rejections at both UIST and CHI before we published the paper at the following UIST, 2 years after the initial idea. Even then, the paper almost didn't happen because the reviewers were skeptical (borderline ratings) but Nedi wrote a convincing rebuttal, as a metareviewer summarized, "I re-read the paper in light of the rebuttal. The proposed changes [...] pushed me into the slightly positive end of the spectrum. The submission was discussed at length at the PC meeting and received additional input from another PC member who reviewed the submission at a prior venue. The overall feeling is that this isn't a perfect paper, but it is a difficult area in which to do research and we do learn something from the submission."

Close call, but this became the foundation for the rest of Nedi's Ph.D. work. We were lucky to publish it sooner rather than later because I later found out there were other research groups working on similar ideas.

SleepCoacher: A Personalized Automated Self-Experimentation System for Sleep Recommendations. Nediyana Daskalova, Danae Metaxa-Kakavouli, Adrienne Tran, Nicole Nugent, Julie Boergers, John McGeary, Jeff Huang. UIST 2016.

This paper followed from Alexandra's previous one, so was a bit more straightforward as a follow-on application to our previous work. Reviewers liked it, and it received an honorable mention. I wish I had more to say, but it was one of the rare times where the idea was universally agreeable and the results were as expected. Later I learned that during my tenure application, an external letter writer remarked that I didn't have many follow-up papers (sequels) at that time, in fact just this one.

SearchGazer: Scalable Webcam Eye Tracking for Remote Studies of Web Search. Alexandra Papoutsaki, James Laskey, Jeff Huang. CHIIR 2017.

Work on this paper started mid-2015, led by Shaun, a Masters student who became a Ph.D. student later. It was a fairly complex product so it took the team substantial time to build it out, and the paper was not published until 2 years later. What's nice is the paper had some broad impacts: two of the undergraduate students working on it are in Ph.D. programs now, and there are still active users of the online web application 6 years since the work started.

This paper set the standard that we would try to include undergraduate students in every paper, and so far that still holds true—100% of the papers from our group have included undergraduate authors.

Drafty: Enlisting Users to be Editors who Maintain Structured Data. Shaun Wallace, Lucy van Kleunen, Marianne Aubin-Le Quere, Abraham Peterkin, Yirui Huang, Jeff Huang. HCOMP 2017.

We followed a similar formula as before, having students in the HCI seminar run mini-studies, which became a meta study for this paper. But things were not so easy this time around, as the first version of the paper with one cohort struggled to reveal enough compelling findings. So we had to develop a new procedure for students in the seminar in another year, and combined the results from both cohorts for the submission.

We submitted to CHI 2017 and while two reviewers rated it highly (4.5/5 and 4/5), the third wrote a scathing review; the two metareviewers examined the paper closely, and ultimately decided to reject. It was a little frustrating to be so close, as this paper had the highest average rating of all the rejected papers that year. However, we revised it and ultimately published it at IMWUT the following year after a cycle of major revisions.

Lessons Learned from Two Cohorts of Personal Informatics Self-Experiments. Nediyana Daskalova, Karthik Desingh, Alexandra Papoutsaki, Diane Schulze, Han Sha, Jeff Huang. IMWUT 2017.

This paper was an exhausting amount of work for Alexandra, collecting a high-quality dataset with the hopes to release it as a contribution. The setup, lengthy procedure, and large number of participants were meant to provide stronger validity for the general topic of eye tracking during interactions.

However, even after the data collection, there were a few snags. We learned that video frames did not inherently have timestamps associated with them, and it was nearly impossible to retroactively infer them to millisecond-level accuracy. While the dataset itself still felt like a strong contribution in the end, it was harder for other researchers to apply immediately so hasn't been as broadly used as I hoped. I still feel like this paper is a bit underrated today, and someone could write one or two other papers from the dataset we collected.

The Eye of the Typer: A Benchmark and Analysis of Gaze Behavior during Typing. Alexandra Papoutsaki, Aaron Gokaslan, James Tompkin, Yuze He, Jeff Huang. ETRA 2018.

My most recent Ph.D. admit, Jing, came with a unique design and technical background, so started working on an ambitious virtual reality project. That idea had trouble producing consistent results in practice, but I noted in my research journal in October 2016 that the "motion movement physical device that Jing built that can be used for replay too". So we pivoted to building the product for a different use case which became this paper.

The paper was accepted on its second submission and was one of the rare times I've encountered an "accept" decision without having to do a major revision beforehand. But as a product it has been disappointing; we attempted to deploy it to usability professionals, the target audience, but it turns out that very few people were willing or able to 3D-print their own components. We learned from this so the project has not ended here, and we are nearly finished with a sequel, 5 years after the initial idea, to reach our original vision.

Remotion: A Motion-Based Capture and Replay Platform of Mobile Device Interaction for Remote Usability Testing. Jing Qian, Arielle Chapin, Alexandra Papoutsaki, Fumeng Yang, Klaas Nelissen, Jeff Huang. IMWUT 2018.

After meeting with Eda, a Masters student who wanted to work on a research project with me, the first note in my research journal was "discussed rewind: cool idea but low chance of publishing". I had no idea how true that would be, as this became the most challenging paper my group would publish. What started in January 2015 was published December 2018, 4 years later, and was passed from student to student after each one graduated. I lost count of the number of rejections.

Many of the authors had never met each other, and it was a bittersweet moment for me when by pure coincidence, the original author, Eda, was standing in the hallway outside my office with the final author, Neilly, without knowing one another (which I immediately corrected by introducing them).

What was challenging about this paper was that the engineering work used existing known techniques, so we had to emphasize how the experience was a contribution on its own. This was hard to do in a study, as it wasn't about directly improving any specific aspect of life, but being able to experience it differently. I begged my old colleague Gifford to help in early 2018, and what put it over the finish line was careful mixed-methods descriptive writing based on the detailed analysis Gifford directed.

Rewind: Automatically Reconstructing Everyday Memories with First-Person Perspectives. Neille-Ann Tan, Han Sha, Eda Celen, Phucanh Tran, Kelly Wang, Gifford Cheung, Philip Hinch, Jeff Huang. IMWUT 2018.

This was the first time I had been involved in a project with both hardware and software components, so the system itself took longer than expected, and we were writing this paper down to the deadline. The paper had to be carefully crafted to describe a complex configuration, with 3D-printed parts, augmented reality, cameras and sensors, computer vision, heating and energy issues, and both wired and wireless networking.

Reviews were mixed, but we thankfully had support from our metareviewer, "I am looking forward to hopefully a strong rebuttal so I can be your advocate at the UIST 2019 PC meeting." This encouragement was exactly what we needed in that moment.

The effort was worth it, because this system led to a few other projects, and serves as a foundational paper for Jing's Ph.D. What we are still trying to figure out today is how to deploy this as a product to regular people, as the hardware requirements again posed a barrier for adoption.

Portal-ble: Intuitive Free-Hand Manipulation in Unbounded Smartphone-based Augmented Reality. Jing Qian, Jiaju Ma, Xiangyu Li, Benjamin Attal, Haoming Lai, James Tompkin, John Hughes, Jeff Huang. UIST 2019.

Nedi turned her earlier SleepCoacher paper into a fully automated process, completing our vision from the seed of an idea 6 years ago. The product described in this paper was shipped to the app stores and used by whoever would come across and download it, basically real usage by people we did not recruit. We maintained Android and iOS apps separately, and a server to do the calculations. It was a costly mistake to build out three separate systems; we should have started with something cross-platform and performed the calculations in the app itself to reduce the software maintenance from three systems to one.

The paper was hard to publish, because unlike recruited and paid participants, our 5,000 app store installs (now 7,700) led to messy data: a lot of people never opened the app, or did so only once. Reviewers were unimpressed that thousands of installs only led to a couple hundred active participants, of whom only about a fifth tracked for enough nights to get useful information.

In the end, it was a close decision but the CHI program committee decided that it could be acceptable if shepherded, "I am still leaning positive given the difficulty of the method, the importance of the topic, and complexity of the project as a whole." Being on the program committee myself that year, I wondered to another faculty member why our papers always seem to only barely get in, and they responded matter-of-factly, "all accepted papers barely get in," referring to the declining average scores at CHI over the years.

SleepBandits: Guided Flexible Self-Experiments for Sleep. Nediyana Daskalova, Jina Yoon, Lisa Wang, Cintia Araujo, Guillermo Beltran, Nicole Nugent, John McGeary, Joseph Jay Williams, Jeff Huang. CHI 2020.

This idea grew out of my NSF CAREER Award as the required educational component, where I proposed a classroom tool for large-class simultaneous design activities. But the code was written and rewritten several times by different teams even after that, because running a real-time collaborative system with over 100 simultaneous users introduced its own share of problems.

There were many nervous moments leading up to each attempt. The tool failed the first semester or two that we tried it; the server would crash or some of the data would be lost or corrupted, and we would lose our chance to get data that semester.

Submission-wise, reviewers always wanted more, so the paper itself went through multiple different narratives before being accepted in a tough revise and resubmit cycle. The mood around this revise and resubmit is best portrayed by a reviewer, "I find this to be a mostly solidly executed project, but I don't see a substantial contribution to CSCW / creativity support tools here. I am not sure if this can be fixed in a revision cycle, but if the authors are keen, ..."

Well, we were keen.

Sketchy: Drawing Inspiration from the Crowd. Shaun Wallace, Brendan Le, Luis Leiva, Aman Haq, Ari Kintisch, Gabrielle Bufrem, Linda Chang, Jeff Huang. CSCW 2020.

In 2014, I met with a faculty member in Psychiatry and Human Behavior to discuss a collaboration, where we concluded that the area of mental health lacked innovative computational techniques. I did a rough prototype of our initial idea in 2015, then students picked it up the following year. The work was passed between teams of students, mainly supporting the application development and real usage from a growing list of collaborators. It turns out that many clinical researchers felt the same, and this became our most funded project.

However, the papers took a while, and the one led by my group was rejected throughout 2018 and 2019. The hard part was that, while we were using a fairly unique approach, I didn't have much experience writing about this topic. Reviewers would have varying opinions of what needed to change, so the text would waffle back and forth; we finally arrived at a version of the paper that was satisfying enough, and it was accepted for publication in 2020.

About 20 people were involved at different points in time, but the study didn't start until 2017 so the paper itself was about 3 years in preparation. I feel like the overall goal of computational interventions for mental health is a good one, but it feels like we've only taken a small step.

Sochiatrist: Signals of Affect in Messaging Data. Talie Massachi, Grant Fong, Varun Mathur, Sachin Pendse, Gabriela Hoefer, Jessica Fu, Chong Wang, Nikita Ramoji, Nicole Nugent, Megan Ranney, Daniel Dickstein, Michael Armey, Ellie Pavlick, Jeff Huang. CSCW 2020.

We had been wanting to expand Nedi's SleepBandits app beyond sleep to all sorts of self-experimentation, and finally got a chance when NSF extended one of my expiring grants to fund this paper.

Our first attempt to publish at CHI 2020 was rejected partly due to weak findings, but Nedi had already been preparing a second study to complement what we had. While it still required major edits, CHI 2021 accepted the paper after a straightforward rebuttal. Did I finally end a long streak of struggling to publish? We considered this our easiest paper, but it still took an 8-person team nearly 30 months.

But maybe publishing faster or publishing more is not what it's about. I care more about this project as an app that people can use, so I rebuilt it in the cross-platform Flutter framework, with hopes to use it as a foundation for later work. Hopefully the papers are just a milestone towards people making their lives better through self-experimentation.

Self-E: Smartphone-Supported Guidance for Customizable Self-Experimentation. Nediyana Daskalova, Eindra Kyi, Kevin Ouyang, Arthur Borem, Sally Chen, Sung Hyun Park, Nicole Nugent, Jeff Huang. CHI 2021.

Introspection

After reviewing these notes, I'm a bit ambivalent. When I was a clueless student, I got lucky with my collaborators and acceptance decisions, which made all the difference. After becoming a professor, I had the experience, yet the papers took even longer to publish.

Part of the reason is the focus on systems papers, which are known to take 3-4 years, but shipping them as products has been an even longer 4+ year agenda. Worth it, sure, but we'll never be like most groups that publish 5+ papers a year.

But I also think about how my group has encountered a lot of rejection, and keeping up morale was sometimes difficult. Bad news injects doubt and discouragement into the minds of students, who then have to rally the team to continue the work in hope that acceptance is just around the corner. This dissonance is hard to manage.

The other thing I noticed is that papers that get the most citations later often got poor reviews or multiple rejections. They're usually about a new phenomenon, but the novelty can always be recast as "old thing, but just on the web" or "mostly engineering work, with so-so results". With this in mind, I should probably be generous in my own interpretation of what's novel when I review papers.

In retrospect, I am grateful to many key collaborators for their extra help in those times, and grateful that some conferences like UIST accepted papers despite some obvious flaw, because those papers ended up defining long-term research programs for multiple young researchers.

If you'd like to get updates about our current research projects, subscribe to your choice of project update emails.

Since I published this, I've been informed of a comprehensive series of backstories by Leslie Lamport, one backstory by Matthew Keeter, a history about papers for Legion, and a series of submission notes by Molly Nicholas.

Thanks to Alexandra Papoutsaki, Bo Lu, Gifford Cheung, Tongyu Zhou, and Zainab Iftikhar for their comments on earlier drafts.

Also in this series

This page is designed to last, a manifesto for preserving content on the web

Illustrative notes for obsessing over publishing aesthetics