How collaborative is collaborative writing?
An analysis of the production
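__________________________________________________________________
Version        1       2       3       4       5 (final)
__________________________________________________________________
lines          149     187     242     236     269
paras          50      64      70      63      72
words          1422    1808    2485    2457    2592
__________________________________________________________________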
Table 1. Text details for all versions of the report
As can be seen from these data, the major changes occurred between the first and third drafts, during which time the document grew by almost 75%, from 1422 words to 2485 words. Further growth of less than 5% occurred between the third and the fifth (final) draft. Interestingly, a slight trimming in document size occurred between the third and fourth drafts, countering the idea that documents inexorably grow in size until the authors cannot think of anything new to write.
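The percentages quoted above follow directly from the word counts in Table 1. As a minimal sketch of that arithmetic (the names `words` and `pct_change` are illustrative only and are not part of the original study):

```python
# Word counts per version, copied from Table 1.
words = {1: 1422, 2: 1808, 3: 2485, 4: 2457, 5: 2592}


def pct_change(old: int, new: int) -> float:
    """Percentage change from old to new, relative to old."""
    return (new - old) / old * 100


growth_1_to_3 = pct_change(words[1], words[3])  # first to third draft
growth_3_to_5 = pct_change(words[3], words[5])  # third to final draft
change_3_to_4 = pct_change(words[3], words[4])  # the slight trimming

print(f"versions 1 to 3: {growth_1_to_3:+.1f}%")  # +74.8%
print(f"versions 3 to 5: {growth_3_to_5:+.1f}%")  # +4.3%
print(f"versions 3 to 4: {change_3_to_4:+.1f}%")  # -1.1%
```

The negative value for versions 3 to 4 corresponds to the slight trimming noted in the text.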
In order to gain insight into how the document developed beyond issues related to size, the various contributions made by each author were examined. This process was enabled by the previously agreed restriction on authors to identify their contributions by a unique font or style. In the present instance the primary author used plain Times 12 pt, while the second and third authors used bold and italic versions of this font respectively.
Each modification was attributed, noted and listed. Then they were grouped according to similarity in terms of how they altered the document. This classification created major categories for text addition, deletion, correction, ordering and queries, most of which are self-explanatory. It is assumed that the primary activity in an authoring task is the creation or addition of text and this category is further broken down into types of addition. Those noted in the present study were the addition of signposts (e.g., headings or sentences aimed at orienting the reader), extensions (e.g., more detail or elaborated workings of existing sentences), generation (i.e., the creation of text based on new ideas or issues without altering the existing text, at least one paragraph in length), examples and wordings.
Each grouping was then assessed in terms of occurrences across drafts. The final classification of modifications is presented in Table 2.
__________________________________________________________________
Modification Type                        Version
                                         1-2   2-3   3-4   4-5
__________________________________________________________________
Addition of new text:   Signposts         1     0     0     1
                        Extensions        1     3     0     0
                        Generations       2     5     0     1
                        Examples          0     1     1     0
                        Wording           0     0     1     2
Error correction        Spelling          0     0     1     0
Re-ordering                               0     0     1     0
Queries                 Details           1     3     0     0
                        Objections        0     3     0     0
                        Clarification     0     4     1     0
__________________________________________________________________
Table 2. Text modifications for all document drafts
As can be seen, the majority of activity occurs as addition of new text. This is hardly surprising in itself; the real interest lies in the type of addition and its time of occurrence across the document development cycle. The most frequent additions were generations of new points and extensions of existing ones.
A generation, by definition, must be at least a paragraph in length, though it could be much more, i.e., it must represent an independent text unit covering or dealing with a particular issue. Extensions, on the other hand, may be only a line or two in length and include only material that is an elaboration of the existing text, with insufficient detail or length to justify describing it as a "generation".
As indicated in Table 1, the largest change in size occurs between versions 2 and 3. This cycle is obviously the major growth period for the document. This size growth is matched by the modification details presented in Table 2, where it can be seen that 5 generations and 3 extensions occur. Also interesting at this point is the number of queries that the authors have. Questions of detail, clarifications, and objections are raised with more frequency here than at any other stage. Why should this be?
The most likely explanation lies in the fact that the cycle that started with the second draft represented the primary author's first chance to respond to the other authors' comments on his preliminary draft. So the modifications between 1-2 represent all authors' first inputs. The modifications termed "2-3" start the moment the first author alters the text of the first draft (with comments) and circulates it to the other authors, and end prior to his responding to the others' subsequent comments on these modifications. Thus this cycle is the first truly interactive cycle where all authors can comment on all other authors' inputs.
The lack of deletions is also interesting and is probably best explained by the fact that the authors are all familiar with each other's ideas and writing style and were writing up a report on a subject with which they were familiar and on which they had discussed their views prior to the first draft. Even so, it is surprising that no deletions were marked in any versions of the text (the drop in size in versions 3-4 reflects a major rewording that reduced two lengthy sentences to one). The absence of spelling errors results, in the main, from the use of spell checkers.
Discussions amongst the authors afterwards indicated that they all felt the writing process to have been typical of their normal style and had not found the idea of using unique text formats or the knowledge that the production was being monitored, intrusive. This team has used hypertext to produce an academic paper (using GUIDE) and are keen to try the process again. Like Trigg and Suchman (1989) they found that the centralised workspace and single working copy of a document was useful but they still used paper printouts to work on and created some text for that document on their usual machines with word processors before copying it into the centralised hypertext document. Their experiences using a shared workspace had not encouraged them to write in this fashion subsequently.
Three authors all working on a CEC-funded multi-partner project contributed to this document. Two were human factors specialists working in academia, the third was the project manager, working for a management consultancy company who had a background in software engineering.
As a result of a perceived need for a policy document on issues associated with the project two authors agreed to write it (including the present author). By the time the document was finished the author list had grown to three (two from academia, one from the partner site). The academic authors used equipment identical to that mentioned above. Situated in different offices they could pass files over the network (using Public Folder on the Mac) and had the services of a project secretary using identical equipment. The author at the second site had IBM PCs but normally opted to use WordStar on a portable Toshiba. He also had the services of an on-site secretary.
Unlike the previous study, this was the first time the authors had written together. Furthermore, two of the authors did not realise that this process was to be analysed by the third author, which precluded the use of font type and size as identifiers of authors' inputs. Both authors have subsequently been informed of this and have not objected to analysis or publication of the data. To maintain the possibility of input identification, the "experimenter" kept copies of each author's drafts and noted alterations by direct comparison of drafts. As it turned out, this process was simplified by developments in the authoring process.
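In this study the comparison of drafts was carried out by hand. Purely as an illustration of how such a comparison might be automated, the sketch below uses Python's standard `difflib` module to list the word-level changes between two drafts. The function name `list_alterations` and the sample sentences are invented for the example; they are not taken from the study documents.

```python
import difflib


def list_alterations(old_draft: str, new_draft: str) -> list[tuple[str, str, str]]:
    """Return (kind, old_text, new_text) for every changed span, compared word by word.

    kind is one of difflib's opcode tags: "insert", "delete" or "replace".
    """
    old_words = old_draft.split()
    new_words = new_draft.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged text is not an alteration
        changes.append((tag, " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2])))
    return changes


# Hypothetical sentences standing in for two successive drafts.
draft_a = "The policy covers training needs"
draft_b = "The policy covers all training and support needs"

for kind, old_text, new_text in list_alterations(draft_a, draft_b):
    print(kind, repr(old_text), "->", repr(new_text))
```

A listing of this kind only locates the changes. Attributing each one to an author and assigning it to a category such as "extension" or "rewording" would still require the manual judgement described in the text.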
A pre-draft meeting between three potential authors was arranged to discuss the document and identify the key points that it should include. Those present were two participants from academia and the partner from the remote site. Of the academic participants, one became primary author of the document, the other failed to play any further role in text production and will be discounted from subsequent analyses. Her place on the authoring team was taken by another academic at a later stage in the writing process. The structure of the final authoring team was: primary author (senior researcher), second author (Consortium Project Manager, based at remote site) and third author (local academic team manager).
The initial meeting resulted in the development of an outline in the form of section headings and likely issues to cover. These were agreed by discussion and resulted in a sketchy framework, indicating four major sections and suggesting the type of material to be included in each one. Crucial issues at this stage were the identification of audience requirements and acceptability criteria, matters to avoid discussing and the perceived purpose of the document. Central inputs were written comments made by project reviewers and knowledge of accepted views on such policy matters. The timescale was agreed by reference to project demands and existing deliverable and workpackage deadlines, taking account of annual leave plans and likely future project meetings that would require this document as input. This gave a timescale of 10 days for the first draft, and 40 days to complete. Then, as before, one person was charged with writing the first draft. In this case, the primary author (the most junior of the final three authors) volunteered for the task. No information technology was employed at this stage; all planning occurred on paper and whiteboard.
When this draft was produced by the primary author (using his word processor - Word 4.0 on the Macintosh) it was sent by fax to the second author who worked at the remote site. The use of fax was requested by the receiving author as he had no reliable e-mail connection and e-mail was felt to be of uncertain value by both authors who felt that it might "lose" some of the text and/or require too much re-formatting of the document each time it was sent. Since this meant he had only a paper copy of the document he chose not to modify the text electronically. Instead, he scribbled some comments on the fax itself and produced several pages of text on his portable computer, outlining all modifications he wanted to make. These pages along with the original but marked-up first draft were faxed back to the primary author. This, as before, shall be referred to as the 1-2 modification stage.
The document was then modified by the first author again in the light of the second author's comments, using the fax as a marked-up manuscript with which to modify the original electronic version. This shall be referred to as the 2-3 modification stage and consisted of including and reacting to the comments of the second author. In other words, the activities of text correction and creation occurred in parallel at this stage. After this it was given to a third author (the local project team manager), who heretofore had not been involved, to modify and make suggestions. He commented by using a pencil to edit the printout from the previous stage. His comments were accepted by the first author, who gave them in the form of the marked-up manuscript to the project secretary to add and then format the final document. This represented an unusual political inversion in that the final say in editing lay with the most junior (in organisational rank) member of the authoring team.
As before each version was measured in terms of line, paragraph and word numbers. For the hand modified versions this involved typing the modifications into the electronic version before calculation. These data are presented in Table 3.
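The paper does not say how the counts were obtained, and a word processor's own count would be the obvious route. As a rough sketch only, the same three measures can be computed for a plain-text version of a draft, assuming hard line breaks and paragraphs separated by blank lines. The function name `text_details` and the sample text are hypothetical, and figures produced this way will not necessarily match those of any particular word processor.

```python
import re


def text_details(text: str) -> dict[str, int]:
    """Count non-empty lines, blank-line-separated paragraphs and whitespace-separated words."""
    lines = [line for line in text.splitlines() if line.strip()]
    paras = [para for para in re.split(r"\n\s*\n", text) if para.strip()]
    words = text.split()
    return {"lines": len(lines), "paras": len(paras), "words": len(words)}


# Invented sample text: a heading and two short paragraphs.
sample = "Title line\n\nFirst paragraph line one\nline two\n\nSecond paragraph\n"
print(text_details(sample))  # {'lines': 4, 'paras': 3, 'words': 10}
```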
Here we can see a major growth between versions 1 and 2, followed by a trimming of the document by its third version and another major growth again by version four. Unlike the previous document, each version here represented only the efforts of one author in that the first draft was produced by the primary author, the second version was one author's responses and suggestions to this, the third was the primary author's reaction to these modifications and subsequent re-writes, while the fourth version was the product of the third author's comments and modifications to version three as approved by the primary author. These labels are the participants' own referencing system.
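__________________________________________________________________
Version        1       2       3       4
__________________________________________________________________
lines          271     273     267     274
paras          25      21      23      25
words          2125    2311    2155    2351
__________________________________________________________________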
Table 3. Text details for all versions of the policy document
As before, all drafts were examined and all modifications noted. These were then grouped according to general type, whereupon it soon emerged that the range of modifications was greater in this document. The classification system was therefore adjusted to take account of these, though where similar activities occurred the same labels were used to aid comparison. It should be noted that it is not the intention of the present author to produce a robust classification to support all such analyses; those used here represent first attempts at each data set, and much more work is required to produce a reliable and valid classification for all such data. The modifications made to this document are thus shown in Table 4.
It is immediately obvious that this report was subject to more modifications than the previous document. Particularly noticeable are the large number of rewordings, classified here according to size, with large referring to at least a paragraph, medium to anything more than a sentence but less than a paragraph and small as rewording at the sub-sentence level. In particular, the modifications from version 1-2, when the author at the remote site modified the first draft, involve 24 rewordings of all sizes. Yet this author did not produce a single text generation (i.e., new text unit of at least a paragraph in size) indicating a more reactive than generative writing style on his part. Only small rewordings occurred in subsequent drafts.
The most frequent additions are extensions (i.e., elaborations of existing ideas up to a paragraph in length). Interestingly the largest number of extensions come from the final author who provided 8 extensions as well as 2 generations. The bias towards extensions over generations runs counter to the trend in the previous document study highlighting perhaps an unwillingness/inability on the less cohesive team's part to create original text (an unfortunately speculative hypothesis that might be explicable in other terms such as perceived lack of time or high agreement between authors on the sufficiency of the first draft).
Deletions were also more frequent in this study than in the previous one. Classified into large, medium or small according to the same criteria as the rewordings, these occurred mainly in version 2-3, i.e., were largely the work of the primary author in his response to the modifications of the remote author. However, it must be noted that some of these involved deleting text that originally asked questions or sought extensions from the remote author on the first draft that he had failed to supply or answer adequately.
_____________________________________________________________
Modification Type                            Version
                                             1-2   2-3   3-4
_____________________________________________________________
Addition of new text:   Signposts             4     1     0
                        Extensions            3     2     8
                        Generations           0     2     2
                        Examples              1     0     0
Deletion                Large                 1     0     0
                        Medium                0     3     0
                        Small                 1     0     0
Re-wordings             Large                 5     0     0
                        Medium                2     0     0
                        Small                17     2     3
Error correction        Point/wording         1     3     1
                        Typos                 0     5     2
Queries                 Discussion points     2     0     0
                        Questions             1     0     0
                        Answers               3     0     0
                        Objections            1     0     0
                        Suggestions           4     0     0
_____________________________________________________________
Table 4. Modification types for second document
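The figures quoted in the text, such as the 24 rewordings at stage 1-2 and the concentration of deletions at stage 2-3, can be checked by summing the rows of Table 4. The following is a minimal sketch of that tally. The counts are transcribed from the table, while the names `STAGES`, `TABLE_4` and `stage_totals` are illustrative only.

```python
# Counts per modification stage, transcribed from Table 4.
STAGES = ("1-2", "2-3", "3-4")
TABLE_4 = {
    "Addition of new text": {
        "Signposts": (4, 1, 0),
        "Extensions": (3, 2, 8),
        "Generations": (0, 2, 2),
        "Examples": (1, 0, 0),
    },
    "Deletion": {"Large": (1, 0, 0), "Medium": (0, 3, 0), "Small": (1, 0, 0)},
    "Re-wordings": {"Large": (5, 0, 0), "Medium": (2, 0, 0), "Small": (17, 2, 3)},
    "Error correction": {"Point/wording": (1, 3, 1), "Typos": (0, 5, 2)},
    "Queries": {
        "Discussion points": (2, 0, 0),
        "Questions": (1, 0, 0),
        "Answers": (3, 0, 0),
        "Objections": (1, 0, 0),
        "Suggestions": (4, 0, 0),
    },
}


def stage_totals(rows):
    """Sum a mapping of rows column by column, giving one total per stage."""
    return tuple(sum(column) for column in zip(*rows.values()))


category_totals = {name: stage_totals(rows) for name, rows in TABLE_4.items()}
overall = stage_totals(category_totals)

for name, totals in category_totals.items():
    print(f"{name:22s}", dict(zip(STAGES, totals)))
print(f"{'All modifications':22s}", dict(zip(STAGES, overall)), "total:", sum(overall))
```

On these counts, stage 1-2 accounts for 46 of the 80 recorded modifications, which agrees with the observation that the remote author's response to the first draft was the most heavily edited cycle.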
The number and type of queries in this document were also greater than in the first. The primary cause of this seems to have been the inclusion of many questions or suggestions for points to be covered in the first and second drafts, which were either answered or removed by the subsequent author. By the time the third draft was produced all such queries were resolved. Typical queries were of the form:
"Need to state something about X's role in this work?"
or:
"Is it worth saying this here, could we leave it out? The decision is yours."
Not all issues were resolved. In two instances a failure to elicit a response from another author to a question resulted in the original author dropping the point altogether.
Three interesting issues emerge from these analyses. The first is the extent to which there are similarities and differences between the two authoring teams. The second is the extent to which collaboration existed and could be supported in these writing scenarios. Finally, the issue of generalisability of these results warrants attention. This section addresses these issues in turn.
The general sequence of activities involved in producing these texts was similar for both groups. First there was the identification of the need for such a document (either imposed e.g., as a project requirement in study 1, or agreed e.g., on the basis of on-going group discussions in study 2). This led to pre-draft discussions. These discussions included various inputs in the form of other documents, notes, views, knowledge, data and so forth. A primary author was "identified"; a complex process influenced by group politics, ratings of others' intellectual strengths and weaknesses, time availability and willingness of participants to take a leading role. The primary author produced a draft according to an agreed plan, others responded, and the task was completed when agreement or time limits were reached.
Both teams relied heavily on the first draft and in each account, the producer of this draft acted as a type of "gatekeeper" to the document's contents. In other words, the primary author had the ability to make certain adjustments to the suggestions and modifications of the other authors by virtue of his position in the cycle. Though no formal editorial position was granted to an author in either team, the "gatekeeper" largely controlled inputs and modifications. For example, the first team passed all their modifications cyclically back to this person who modified each draft and re-released it. In producing a second draft he had to respond to any queries raised by other authors and incorporate any suggestions they made that did not neatly slot into the existing text. In the second team, the "gatekeeper" acted as a collection point for other authors' comments with the net effect that authors two and three never actually communicated directly with each other and author two never directly received author three's comments. Under such circumstances it becomes increasingly difficult for non-primary authors to monitor the progress of their own inputs as the cycles progress and technology may possibly offer a solution to this potential problem.
The quality of the first draft in each case seems determined largely by the quality of the preceding discussions between the authoring team and the ability of the primary author to reflect these ideas and points in his initial draft. In both case studies such discussions resulted in an explicit proposal for the document's structure and likely content type for each section. For the first team the subject matter and structure resulted from shared consultancy experiences and pooled interpretations carried out over several weeks as well as the contents of earlier reports, therefore ensuring that most disagreements between participants had been ironed out by the time the first draft of the final report was produced.
In the second team, the contents of the first draft were determined by discussions between the authors at the initial meeting. As stated above, this meeting did not involve all three of the final authors of the document. Lasting the best part of a working day, the meeting ranged over numerous issues that were irrelevant to the task of document production or the report's contents. The decisions taken at this meeting meant that the primary author had a large degree of autonomy in producing the first draft. It is perhaps surprising therefore that more generative modifications were not made at all stages for this document.
The first team were very familiar with each other's work and writing styles. Having produced numerous reports and papers together, they have evolved a production style that appears "efficient" (however that is measured) and allows all authors to have several opportunities to influence the development of the document. They produced more draft versions but made fewer modifications than the other team. The second team, less familiar with each other both personally and in terms of writing, spent more effort rewording and deleting text from each other's versions. Drafts were not circulated amongst everyone at all stages and the resulting document was altered more in style than in content by the time it was completed.
It may be tempting to assume that these differences reflect the familiarity levels of the authoring teams more than anything else. However it is possible that increased familiarity would not alter the general style which may result from the characteristic writing method of individuals or combination of individuals making up the collaborative team. Issues pertaining to subject matter, individual knowledge, the political structure of the team (who has the final say? who can criticise whose ideas? whose inputs are most important? etc.) and time availability are almost certainly all contributory factors here. Obviously such dynamics are worthy of further investigation and would need to be accounted for in any proposed model of collaborative authoring.
The second interesting issue and probably the most important one from the perspective of authoring environment design is the precise nature of collaboration that can be seen from these document records and what this tells us about designing writer-compatible technology.
What is most striking is how little collaboration is apparent in the document records. The first case certainly reveals the querying of details, objections to certain points and requests for clarification by all authors on several issues at the start, but these are virtually absent from the final two drafts. Similarly, with the second document there are several suggestions and discussion points in the first two drafts but these become non-existent after that. What has happened to the suggestions and counter-suggestions, discussions and arguments, agreements and disagreements that one would assume to be part and parcel of the collaborative authoring act?
It seems that debate about content and issues was largely concluded prior to the production of the first draft and subsequently handled verbally among the relevant authors. For the first team at least, the present technology in its current socio-technical environment (shared office, compatible technology, high integration of established work practice) offers an acceptable medium for text production (it should be noted that their method of text production was chosen by them despite the opportunity to use a supposedly more supportive hypertext environment for this task). Two of the second team also discussed the contents of the first draft of their document in advance but had to handle all subsequent debate through the medium of the document or if need be via telephone. However, these authors did communicate via telephone on several occasions during the production of this document but as they had parallel issues to address not concerning the document, the only noted conversation about it dealt with issues such as deadlines and circulation rather than content.
The generalisability of these results is debatable and caution must be exercised. The authoring teams consisted in total of five different authors but four of these worked in the same organisation. From the perspective of the author involved in both teams there were noticeable working practice differences that impinged on the task and such differences must be seen as major determinants of collaborative style. How research should tackle this issue is difficult to envisage.
The reliance of both teams on a primary author to generate a first draft may be interpreted by some as indicating that these studies do not really address collaborative authoring but reflect the process of a single author seeking feedback. This would be flawed reasoning, however. Both teams set about with the intention of collaborating, chose to work this way and concluded that the documents were collaboratively written. These data, coupled with the findings of Beck (this volume) that less than a third of collaborative authors she surveyed produced documents without a primary authoring role emerging or being agreed, suggest that such a role is the norm rather than the exception.
The text type could also be a factor influencing the nature and extent of collaboration witnessed here. Both documents were relatively technical and had to be produced to deadlines. Elaborate discussion and debate would hardly have been acceptable to the authoring teams, particularly through the rather constrained medium of text. Cohesion and adherence to a single view was brought about through pre-draft activities and discussions. Remaining debate seems to have been taken care of in the first couple of drafts. Certainly the document creation processes described here were typical of the participants' professional authoring activities and they are examples of real world collaborative authoring but it remains the case that studies in other domains may reveal distinct patterns of activity.
The investigative method used in these analyses also deserves comment. It is clear that the analysis of document records is far from an ideal method of studying collaboration amongst authors, and the present studies would have been improved with more information on the nature of all discussions between authors, especially as it is these aspects of authoring that seem to reflect the greatest collaboration according to these data. Certainly this is an issue to address in further studies. Participation in the studies by the investigator, particularly without the prior knowledge and consent of all parties, throws up ethical as well as practical issues. Given the nature of the material, the direct involvement of the investigator in the main aim of the text production task and the agreement of all participants to proceed with the analysis, it is not seen as a major concern here but potential investigators should be wary of attempting such analyses without preparation. However, the method does provide a reasonably unobtrusive account of what happens when more than one author attempts to produce a document. To date, such accounts have been thin on the ground.
If we accept a definition of collaborative writing as the activities involved in the production of a document by more than one author then pre-draft discussions and arguments as well as post-draft analyses and debate are the collaborative components. However, such a broad definition will always embrace activities that are not necessarily directly relevant to the task of writing and render the quest for a technology to support the collaborative writing task partially indistinguishable from other collaborative technology designs. This is no bad thing. In a very real sense we do not need a collaborative writing tool per se, we need better technology for collaboration e.g., better communications facilities enabling people to contact each other, transfer documents, graphics, audio and video images easily and reliably. Given this, many technology-mediated tasks will be better supported. Writing will be just one of these (see e.g., Baydere et al, this volume for details of multi-author writing with a CAD/CAM system). The search therefore for a dedicated collaborative writing environment is too narrow because in itself, collaborative writing is a misleading term.
We are in the midst of a trend that emphasises collaboration as the dominant theme in advanced work technology design. Whether this results from a justifiable if under-critical reaction to the standard information processing view of human activity espoused for the last 20 years by most psychologists and ergonomists or from a desire to keep up with the current emphases in cognitive science is not clear to me. While I hope it is the former, and cannot find fault in the view that all work is in some sense a collaborative act, I would advocate a more realistic view of collaboration that doesn't insist on it being the major factor to concern ourselves with in ergonomic system design. Certainly it has a place, it is a component of most tasks but it is not necessarily the most essential component, that which we must address to the relegation of all others (for a more balanced view of the relevant system design issues see Eason, 1991). The major problem with much of the literature on collaboration remains the translation of its undeniably valid perspective into applicable design advice (see Newman and Newman, this volume for example).
Certainly in the authoring situation there appears to be a meaningful distinction between the act of writing, which is intrinsically an individual activity, and the writing task, which may involve all types of collaborative acts, in much the same way as traditional task analysis breaks jobs into tasks into actions. But such collaborative acts are not necessarily unique to the writing situation but reflect the manner in which most human activities proceed. In collaborating in this fashion, authors may expend considerable effort dealing with issues that are not even contingent on the writing task but deal with related subjects and work activities. The stages of document production outlined at the start of the discussion demonstrate the fact that before actually drafting a document, the participants were involved in activities that could not be deemed uniquely "writing" work. The role of the technology surely is to support people and to this end designers should not try to control or manipulate collaboration but just concentrate on providing the most transparent media possible and let the naturally occurring processes of group working take care of themselves. To concern oneself directly with supposed collaborative authoring environments is to risk losing sight of this and hyping up what may only be another technological solution looking for a problem.
The author would like to thank all participants in these investigations for their permission to analyse and publish the data. Thanks are also due to Mike Sharples, Eevi Beck and a further anonymous referee for their encouragement and sound advice on improving this paper.
The work on collaborative writing was funded by the British Library Research and Development Dept. and carried out by the author at HUSAT as part of Project CHIRO.
Baydere, S., Casey, T., Chuang, S., Handley, M., Ismail, N. and Sasse, A. (1992) Multimedia conferencing as a tool for collaborative writing: a case study. In M. Sharples (ed.) Computer Supported Collaborative Writing, London: Springer.
Beck, E. (1992) Collaborative writing: a questionnaire survey. In: M. Sharples (ed.) Computer Supported Collaborative Writing, London: Springer.
Eason, K. (1991) Ergonomic perspectives on advances in human-computer interaction. Ergonomics, 34(6) 721-741.
Hahn, U., Jarke, M., Eherer, S. and Kreplin, K. (1991) Co-AUTHOR: A hypermedia group authoring environment. In: J. Bowers and S. Benford (eds.) Studies in Computer Supported Collaborative Work: Theory, Practice and Design. Amsterdam: North Holland.
Hayes, J. and Flower, L. (1980) Identifying the organization of writing processes. In L. Gregg and E. Steinberg (eds.) Cognitive Processes in Writing. Hillsdale N.J.: Lawrence Erlbaum Associates.
Newman, R. and Newman, J. (1992) Social writing: premisses and practices in computerised contexts. In: M. Sharples (ed.) Computer Supported Collaborative Writing, London: Springer.
Sharples, M., Goodlet, J. and Pemberton, L. (1989) Developing a writer's assistant. In: N. Williams and P. Holt (eds.) Computers and Writing, Norwood N.J.: Ablex.
Trigg, R. and Suchman, L. (1989) Collaborative writing in NoteCards. In R. McAleese (ed.) Hypertext: Theory into Practice. Norwood N.J.: Ablex.
Wason, P. (1980) Specific thoughts on the writing process. In L. Gregg and E. Steinberg (eds.) Cognitive Processes in Writing. Hillsdale N.J.: Lawrence Erlbaum Associates.