Online Publishing Platform, 2010

A proposal submitted to (and rejected by) the National Science Foundation in 2010.

Abstract

Proposed here is a concept we call Evolutionary Publishing (EvoPub) and the continued development of a system to enable it called Hubs of Science, Technology, Engineering & Mathematics (HoSTEM). EvoPub is similar in concept to Wikipedia but tailored to the needs of scientific publishing. The idea is to encourage collaboration and the timely disclosure of information, and to alleviate some of the flaws in the current publishing model. HoSTEM is based on the system powering Wikipedia to allow content to continually evolve so that it never becomes dated and always presents the current state of knowledge on a topic. HoSTEM will consist of three layers of detail, with the initial layer composed of curated Living Review articles. Above Living Reviews (but below them in level of detail) will be STEMpedia, which will house articles intended for the general public. The bottom layer, Open Source Publishing (OSP), will be where users post, discuss and edit complex technical documents, raw data, and computer code using open-source web-based software. OSP will evolve into a set of online journals with editorial boards and peer review systems.

The software to enable HoSTEM consists of two components. The collaborative writing environment is enabled by a highly customized version of MediaWiki. It will automatically typeset the document into a readable format; generate and number equations, figures, tables, and references; and generate cross-reference links for easy navigation. A web application will provide the user interface for importing, generating, and editing content. A collaborative discussion environment will be based on a highly customized version of the blogging software, Drupal. It will be tightly integrated into the collaborative writing environment. The entire package will be bundled for use by any organization. Due to the scalability of the software used, these organizations could be anything from a small research group looking to more effectively collaborate to a large international society. The PIs will benchmark the software in a fast growing international online community consisting of physicists and mechanical engineers. A demonstration site containing this proposal, sample papers written by the PIs, and notes from a class on metamaterials generously donated by Graeme Milton and Biswajit Banerjee is online at http://dssl.mne.psu.edu/nsfsdci (no longer active). Anyone can view the content and see the original source by clicking on the edit tab at the top of the page.

1. Introduction

While the number of publications available online has exploded, the format and basic content of a 'paper' have changed very little. The practice of publishing only self-contained, finalized and static (archival) work that makes a significant contribution to the existing literature was necessitated by the modes of distribution existing half a century ago. Technology has significantly decreased the cost of producing a page of typeset material, but both the need to access that content and its volume have increased. Thus, while the digital age has made it easier and less costly to publish material, any university librarian knows that the financial pressure to keep a comprehensive collection of journals, whether in science, engineering, mathematics, medicine, or the social sciences, has dramatically increased.
That is not to say there have not been advances in publishing. Submission and reviewing are now done electronically. Reader/reviewer interaction is being implemented by the Public Library of Science (PLoS) with their special PLoS ONE [1] journal ($1250 per paper) and by other publishers such as Nature with Nature Precedings and the European Geosciences Union with Atmospheric Chemistry and Physics (ACP). With ACP [2], submissions are handled as at any other journal and the editor, with assistance from select reviewers, pre-screens the papers. If accepted, the paper is professionally typeset and posted as a 'discussion' paper. The paper is open for discussion for eight weeks, during which time the official reviewers submit reviews (which are open for all to see) and registered users comment. The authors can respond at any time to the reviews. The authors then submit a final revised version for consideration by the editor, who decides whether to accept the paper into the archival ACP journal or reject it.
PLoS ONE articles are reviewed traditionally. If a paper is accepted by the editors, it is typeset, but not copyedited, and published.[1] The paper is then open for annotation and discussion.
Figure 1: Architecture of HoSTEM, a prototype Evolutionary Publishing system.
Nature Publishing's system called Nature Precedings [3] is described on their web site as "... a place for researchers to share documents (...). It provides a rapid way to disseminate emerging results and new theories, solicit opinions, and record the provenance of ideas. It also makes such material easy to archive, share and cite." They will accept any work that fits into the scope of the journal (biomedicine, chemistry and the earth sciences); they explicitly do not accept articles in the physical sciences. All submissions are converted to PDF and cannot be edited once submitted. Users can post (attributed) comments about any submission. A voting system is used to acknowledge good submissions and rank them in terms of popularity. While Nature Precedings is a great place to publish preliminary ideas and findings, much of science, engineering, and mathematics is excluded from it. It is also not designed to foster virtual communities or allow content to continually evolve.
New technologies will no doubt continue to be merged into existing publishing systems. However, recent technology has fundamentally changed the underlying landscape of publishing, and what was a near-optimal dissemination system ten years ago would appear no longer to be so. This technology has given rise to a very successful social experiment: Wikipedia [4]. With very few rules and little moderation, Wikipedia has demonstrated that (semi-)open collaborative publishing environments can indeed converge into something stable and useful, and not diverge to chaos as most anticipated. Wikipedia has matured and stabilized into a resource on which many students and researchers rely.
Wikipedia centers around a web application called a wiki. The basic idea is that content on any page can be edited directly in a web browser by any authorized user. Simple markup languages help with formatting and embedding multi-media content. The content is stored and web pages are formatted dynamically from the most recent content. An essential component of wikis is a version control system; all versions of a page are archived and edits attributed to a user. 'Watchers' of a page are notified automatically whenever a watched page is modified. Thus, the system is quite easy to police, with vandalism quickly removed and vandals blocked.
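As a minimal sketch of the wiki concept just described, and not a description of MediaWiki's actual implementation, the following models a page whose edits are all archived and attributed, with watchers notified on every change. The class names, the `notify` callback, and the user names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Revision:
    author: str
    text: str


@dataclass
class WikiPage:
    """Hypothetical page model: a full revision history plus a set of watchers."""
    title: str
    revisions: List[Revision] = field(default_factory=list)
    watchers: Set[str] = field(default_factory=set)

    def current_text(self) -> str:
        # The displayed page is always rendered from the most recent revision.
        return self.revisions[-1].text if self.revisions else ""

    def edit(self, author: str, text: str,
             notify: Callable[[str, str], None]) -> None:
        # Every edit is appended and attributed; nothing is overwritten.
        self.revisions.append(Revision(author, text))
        for watcher in sorted(self.watchers):
            if watcher != author:
                notify(watcher, self.title)

    def revert(self, author: str, index: int,
               notify: Callable[[str, str], None]) -> None:
        # Removing vandalism is just another edit that restores archived text.
        self.edit(author, self.revisions[index].text, notify)
```

Because a revert is itself an archived edit, the vandalized version stays in the history and remains attributed to the user who made it, which is what makes such a system easy to police.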
Wikipedia is not without its problems, and there have certainly been some growing pains. One of the biggest criticisms is that there is no one directly responsible for the accuracy of information. The hope is that the community can police itself. This has largely been the case due to a register-to-edit requirement and there being no real benefit to intentionally posting misinformation. A policy of not permitting opinions, an effort to cite references, and behind-the-scenes discussion pages have contributed to keeping the content accurate. An initiative to have edits to popular pages validated by the community before they are displayed will help quell occasional vandalism, inflammatory statements, and misinformation.
Here we propose a system called Evolutionary Publishing (EvoPub) and a prototype implementation called Hubs of Science, Technology, Engineering, and Mathematics (HoSTEM) that will go beyond just enhancing existing publishing systems by providing a more timely, efficient, user-controlled, and collaborative environment to bring scientific publishing into the 21st century; see Fig. 1. We will develop and promote the infrastructure and software needed to break out of the local minimum to which we have equilibrated and start searching for a new optimal state. The following sections outline necessary changes in both the social structure and the underlying technology to make the system attractive to academic researchers.

2. EvoPub: Revolutionizing Research Dissemination

There are many compelling reasons to change the current system for scientific publishing. Timeliness is one important improvement. Ideas and results could be published as they are generated. The peer review system, which appears to be collapsing under the weight of the number of papers submitted to journals, could be revamped to reward those that do thorough and timely reviews. Collaboration would be encouraged through research discussions. Researchers could be rewarded based on how their work contributes to the field. Everything becomes dynamic (evolutionary) and can be updated as new results are discovered and/or errors found. The negligible incremental cost of publishing another 'page' would mean there could be no (page) limits on the amount of information that can be included; although beyond a point readability would suffer. The community decides what is important by the information they use and how they rate content. Typographical, grammatical, and factual mistakes can be quickly corrected with users providing editorial skills.
There is a good analogy to be made by comparing traditional encyclopedias and traditional scientific publishing to Wikipedia and EvoPub. The benefits that Wikipedia has over, say, the Encyclopedia Britannica are indicative of the benefits EvoPub will have over our present system. One of the great advantages is the timeliness of the information. Information on existing pages can be changed continually as events unfold or discoveries are made. In addition, new topics can be added or branched from more general ones at any time. Another important benefit is that no one person is in charge of writing a comprehensive article about, say, radar. Pieces of information can be added by whomever whenever. People can fill in small bits of information and the expertise of many people with knowledge of specific areas of a topic can be combined into a comprehensive article. Thus, collaboration is inherently promoted and communities naturally form around topics.
While the idea for EvoPub is based on Wikipedia, a transition away from traditional publishing will require a system, HoSTEM, tailored for the purpose.

3. HoSTEM: Flexible Software to Revolutionize STEM Publishing

HoSTEM will consist of three independent but connected hierarchical layers. The middle layer will consist of Living Reviews and will be constructed first. The top and bottom layers, STEMpedia and Open Source Publishing (OSP), will be developed subsequently and evolve with the user base.

3.1. Living Reviews

Like traditional review articles, Living Review pages will summarize the state of the art in a given topic. Because the APS copyright transfer agreement gives authors permission[2], the PIs will contact authors of articles in Reviews of Modern Physics to allow their inclusion in Living Reviews. The PIs will also solicit respected members of the community to contribute original review articles. These articles will be converted to wiki pages that can then evolve as new discoveries are made or omissions in the original article are filled in. They will also have associated discussion pages where changes to the main article are suggested and debated.
Unlike the freely editable pages on Wikipedia, Living Reviews will be curated by a committee. The curators will decide, based on discussion, if and when a Living Review should be updated and who should make the update. For example, if there is a review article on the mechanical properties of carbon nanotubes and group X is finally able to determine experimentally how and when they transition into nanoribbons, either a member of group X or someone familiar with the work would suggest an update to the review article. The curators would then request that someone submit an update to the review. Upon approval by the curators the updated review would be published. Thus, the curators would have a role similar to that of an editorial board, and quality would be assured.
Such a system would also overcome a weakness in the Wikipedia system by providing a necessity in academia: a peer review and reward system. A side bar on each review page would list contributors and the number of contributions each has made to that page's content. Being added to this list would acknowledge a person's contribution to the topic. Because all revisions are archived, users can readily highlight individual contributions.
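The contributor side bar can be derived directly from the archived revision history. The sketch below is one simple way to do so, assuming the history is available as a list of author names (one entry per revision); the function name and user names are hypothetical, and counting revisions is only a crude proxy for the size of a contribution.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def contributor_sidebar(revision_authors: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (user, number of revisions) pairs, most active contributor first.

    Ties are broken alphabetically so the side bar order is stable.
    """
    counts = Counter(revision_authors)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical history of one Living Review page, oldest revision first.
history = ["user_a", "user_b", "user_a", "user_c", "user_a", "user_c"]
for user, n_edits in contributor_sidebar(history):
    print(f"{user}: {n_edits}")
```

A production version would more likely weight each revision by how much text it changed, but the principle of reading credit off the version history is the same.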

3.2. STEMpedia

This layer will be tailored for those outside of the research community, such as K-12 and undergraduate students. These pages will consolidate the knowledge contained in the Living Reviews and present it in a simplified way, similar to encyclopedia articles, and will also have a committee of curators to consider suggestions for changes and additions. Members of the community will be solicited to write these articles, which will be edited for readability and content via the collaboration with the Penn State English Department; see below. STEMpedia will use the same technology as Living Reviews.

3.3. Open Source Publishing

Below Living Reviews will be a layer we call Open Source Publishing (OSP), for disseminating all types of research. This layer will evolve into two separate sub-layers, one (quasi-)static and one dynamic. The quasi-static one will resemble a collection of online journals and contain finalized, peer-reviewed publications with relatively static content. The other will be less formal, containing incomplete, unfinished, or unreviewed (developing) work and even basic ideas for future research directions.
The supporting software for OSP will be similar to the supporting software for Living Reviews and STEMpedia. Because we want to see this evolve into a complete, user-friendly, online scientific publishing system, much effort will go into developing a 'web app' user interface, something like 280North's 280Slides [5] for STEM publishing. 280Slides was built using 280North's development platform, Atlas, and their open source web application frameworks, Cappuccino [6] and Objective-J. This framework is especially appealing to the PI because it is modeled after Apple's Cocoa frameworks and the Objective-C language, which the PI has used extensively for developing iPhone applications.
The ultimate goal of OSP is to have the system morph into a collection of online journals maintained by the community for the community. As a specific example consider $\textit{OSP-JIT}^2$, the Open Source Publishing Just-In-Time Journal of Interesting Things. $\textit{OSP-JIT}^2$ will categorize content as accepted and developing. The developing section will be much like the initial phase of OSP, where anyone (registered) can submit anything and designate it as publicly viewable or not. If an author would like something in this category to be moved to the accepted category s/he will notify a member of the editorial board to officially open the paper for review, with the editors asking respected members of the community to give formal evaluations. These evaluations will be part of discussion pages associated with each article and could be a collection of posts the reviewer makes as s/he reads the paper; reviews can then be submitted in multiple parts as questions or concerns arise.[3] The authors will then have an opportunity to rapidly reply or even modify the paper based on the reviewer's comments. Once the reviewers have completed their evaluations they will notify the editor, who will make a decision based on the reviewers' suggestions and any other discussion of the paper. If the paper is accepted the content will be marked as such and appear in the static (archival) section of $\textit{OSP-JIT}^2$. If the paper is rejected it will remain in the developing category for as long as the authors desire. Changes can be made and the authors could ask the editors to reconsider the paper and again open it for official review.
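The review workflow described above amounts to a small state machine. The following sketch is one possible encoding, assuming three states and three actions; the state names, action names, and the `advance` function are hypothetical labels introduced here, not part of any existing software.

```python
from enum import Enum


class Status(Enum):
    DEVELOPING = "developing"
    UNDER_REVIEW = "under review"
    ACCEPTED = "accepted"


# Allowed transitions, read off the workflow in the text: an author requests
# review, the editor accepts or rejects, and a rejected paper returns to the
# developing category, from which review can be requested again.
TRANSITIONS = {
    (Status.DEVELOPING, "request_review"): Status.UNDER_REVIEW,
    (Status.UNDER_REVIEW, "accept"): Status.ACCEPTED,
    (Status.UNDER_REVIEW, "reject"): Status.DEVELOPING,
}


def advance(status: Status, action: str) -> Status:
    """Return the next status, or raise ValueError if the action is not allowed."""
    try:
        return TRANSITIONS[(status, action)]
    except KeyError:
        raise ValueError(
            f"cannot {action!r} a paper that is {status.value}"
        ) from None


# A paper that is rejected once, revised, and then accepted.
status = Status.DEVELOPING
for action in ["request_review", "reject", "request_review", "accept"]:
    status = advance(status, action)
print(status.value)  # accepted
```

Keeping the allowed transitions in a single table makes it straightforward to add states later, for example a revision stage between review and acceptance.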

4. Community Acceptance

Community acceptance of a new paradigm in publishing will not be easy. Faculty may worry about how administrators will evaluate work published in OSP. Contributors may be hesitant to allow other users to edit their work. Discussions could get bogged down in endless debate. Low-quality submissions could clutter the system and overwhelm the high-quality ones. Users will also likely be hesitant to move away from their current manuscript writing environments. The following paragraphs will address these issues and discuss a plan for gaining community acceptance.
The first step in getting the community to use HoSTEM is to put something compelling and valuable in the system. This is why the initial effort will be to build Living Reviews. There are already a great many review articles posted at arXiv.org that have been published in journals. Along with PDF versions of the articles, most also have the source LaTeX and figures posted. These source documents can readily be imported using software already written by the PI. Living Reviews can then be quickly populated with articles and provide value to the community with little effort. Obviously this will all be done with the permission of the original authors, who will be asked to curate the article. Each article will have an associated discussion forum which the curators can monitor for suggested revisions. There will be no minimum commitment from the original authors to make revisions and they can simply let the page remain static if they desire. However, the hope is that they will let others with more interest in updating the article keep it current. With a few months of additional development and a dedicated effort to contact authors of existing review articles, HoSTEM could have hundreds of articles online with almost no effort from the community.
As seen with iMechanica.org this growth will likely be slow at first as users discover the site and word spreads. Once users are comfortable with the site, discussions are likely to start, suggestions for article revisions made, and communities and collaborations grow. The PIs have also discussed this with members of arXiv.org's board of directors who are interested in the concept; see supporting letter. Linking with arXiv.org would help greatly in making the community aware of HoSTEM.
Once Living Reviews is populated with substantial content, the PIs will start soliciting the community to write articles for STEMpedia. Articles that address STEMpedia's target audience will necessarily be originals. The hope is that authors of Living Review articles will allow their work to be reused reducing the work necessary to put together a STEMpedia article. However, STEMpedia articles will require much effort by the authors to clearly summarize the material to an audience that may be learning about it for the first time. The Penn State English Department internship program will be an integral part of improving the readability of STEMpedia articles. STEMpedia will be the public face of the project so it will be given much attention.
The final envisioned form of OSP, the lowest layer of HoSTEM, will have the most difficulty gaining acceptance, requiring the community to rethink what it means to publish and review research. While the generation comfortable with Wikipedia, Facebook, Slashdot, blogs, RSS feeds, etc. might be comfortable publishing their work in OSP, most researchers are conservative when it comes to publishing. They, rightly, want to publish in well-established places that provide a large audience and have sufficient reputation and prestige. Not only will OSP have the burden associated with starting a new journal, but it will also have the burden of asking researchers to publish in a new way.
OSP will be rolled out in phases to a much greater extent than Living Reviews or STEMpedia. At first it will simply be a place for authors to post their work much like any other e-print archive. To get the system started the PIs will seek permission to convert the LaTeX source from those who have already posted their work in other archives. In this way OSP can be quickly populated with content of interest to the community. Each paper will have an associated discussion thread for comments. The authors will decide if edits (published upon approval) are allowed. When changes are made, only the original author of the content and the author of the changes will be able to see them. If changes are rejected they will remain in the system for their author to modify if desired. If accepted the changes will be incorporated into the publicly viewable page. All versions will always be saved so that alterations can be removed.
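The edit-approval rules in the preceding paragraph can be made concrete with a short sketch. It assumes a single content author per paper and models only visibility and approval; all class, method, and user names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProposedEdit:
    editor: str
    new_text: str
    status: str = "pending"  # one of "pending", "accepted", "rejected"


@dataclass
class Paper:
    """Hypothetical OSP paper with author-moderated edits."""
    author: str
    allow_edits: bool
    versions: List[str] = field(default_factory=list)   # public history
    proposals: List[ProposedEdit] = field(default_factory=list)

    def propose(self, editor: str, new_text: str) -> ProposedEdit:
        if not self.allow_edits:
            raise PermissionError("the author has not enabled edits")
        edit = ProposedEdit(editor, new_text)
        self.proposals.append(edit)
        return edit

    def can_view(self, user: str, edit: ProposedEdit) -> bool:
        # Accepted edits are public; otherwise only the two parties see them.
        return edit.status == "accepted" or user in (self.author, edit.editor)

    def decide(self, user: str, edit: ProposedEdit, accept: bool) -> None:
        if user != self.author:
            raise PermissionError("only the content author can decide")
        if accept:
            edit.status = "accepted"
            self.versions.append(edit.new_text)  # older versions are kept
        else:
            edit.status = "rejected"  # stays in the system for its editor
```

Since `versions` only ever grows, an accepted alteration can later be undone by republishing an earlier entry, in keeping with the requirement that all versions are saved.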
Researchers will also be able to post original work using conversion programs to make the content editable. The work could be anything from a completed paper that has also been submitted to a journal to something in the preliminary stages for which the authors would like feedback. Posting preliminary work and ideas will allow researchers to not only get feedback but also be recognized for having the idea long before a finished paper has been completed, reviewed and published. OSP will then give researchers a forum to present their ideas (what really matters in a publication) in an extremely timely manner and long before all the details are worked out. This should also help foster collaborations as others may have had or are pursuing similar ideas.
One problem with encouraging people to post anything and everything is that the signal-to-noise ratio can become low and users frustrated trying to find useful information. This has been an issue with the explosion of Internet content, and people have put a great deal of effort into filtering content for specific interests. OSP (and HoSTEM in general) will include powerful search algorithms to help users find content. Also, because searching is only useful to those knowing what they are looking for, all content will be categorized based on keywords provided by the authors and results will be sortable by many criteria. Many RSS feeds based on keywords will also be available to passively obtain titles and abstracts from newly submitted content. Additionally, one of the two rating systems to be included will allow users to rate content as 'interesting' or 'uninteresting' to them. These ratings will be similar to those used by RSS feed aggregators such as reddit.com that learn a user's personal interests to help suggest content. Finally, based on the content being viewed and the user's personal rating history, other content will be suggested. These strategies will help make the system as useful as possible for those looking for specific topics as well as those just browsing.
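To illustrate how 'interesting'/'uninteresting' ratings and author-supplied keywords could drive suggestions, the sketch below uses a deliberately simple scheme: each keyword gains or loses one point per rating, and unseen items are ranked by the sum of their keyword scores. This scoring rule, the function names, and the sample data are assumptions made for illustration; the proposal does not specify the algorithm.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def keyword_profile(ratings: Iterable[Tuple[Set[str], bool]]) -> Dict[str, int]:
    """Build per-keyword weights from (keywords, rated_interesting) pairs."""
    profile: Dict[str, int] = defaultdict(int)
    for keywords, interesting in ratings:
        for keyword in keywords:
            profile[keyword] += 1 if interesting else -1
    return dict(profile)


def suggest(profile: Dict[str, int], candidates: Dict[str, Set[str]],
            limit: int = 3) -> List[str]:
    """Rank unseen items by summed keyword weight, dropping non-positive scores."""
    scored = [
        (sum(profile.get(keyword, 0) for keyword in keywords), title)
        for title, keywords in candidates.items()
    ]
    ranked = sorted((pair for pair in scored if pair[0] > 0),
                    key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in ranked[:limit]]


# Hypothetical rating history and unseen submissions.
ratings = [
    ({"nanotubes", "mechanics"}, True),
    ({"metamaterials"}, True),
    ({"economics"}, False),
]
candidates = {
    "Paper A": {"nanotubes"},
    "Paper B": {"metamaterials", "mechanics"},
    "Paper C": {"economics"},
}
print(suggest(keyword_profile(ratings), candidates))  # ['Paper B', 'Paper A']
```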

5. Stimulating Collaboration

A fundamental goal of HoSTEM and EvoPub in general is to encourage collaboration.

5.1. Collaborative Writing

Because the content will be editable, collaborative writing will naturally be promoted. The basis of HoSTEM was originally developed by the PI for collaborating on proposals. To encourage small groups to collaborate, users will be able to mark content as readable and/or writable only by designated users. This will encourage users to collaborate in the system and only allow others to see the content when it is ready. Users will also be able to maintain and share bibliography databases and secure repositories will exist to allow users to exchange data, figures, etc. Researchers will also be able to install the system locally.

5.2. Collaborative Discussion and Reviews

Having used Wikipedia for many years, the PIs find that its biggest weakness is its environment for discussion. Discussion pages are identical to the content pages and require users to format them appropriately. Unfortunately, this freedom and lack of structure usually means discussions are difficult to follow. Because discussions are an essential part of EvoPub and all layers of HoSTEM, a much better mechanism will be developed.
Unlike wikis, blogs were developed from the beginning to promote and organize discussion. Thus, each content page will be linked to a discussion page maintained as a blog entry. The blogging software Drupal is a good open-source solution; PI Li has used it for the successful site iMechanica [7], which has over 3,000 registered users and 30,000 hits per day. A major effort will be to tightly integrate Drupal and MediaWiki so there is a seamless user experience.
This discussion system will also act as the peer review system for OSP once it is organized into journals. While all users will have to be registered to post comments or official reviews, the system will allow these posts to be made anonymously. Entries from official reviewers assigned by the editor will be marked as such so authors can be sure to respond. By default, entries from official reviewers will be anonymous so current practice is maintained. However the PIs hope that those writing thorough and useful reviews will 'sign' their posts.

6. Enabling Technologies

There are many different software packages that implement the wiki concept. Wikipedia uses MediaWiki which is "..free server-based software which is licensed under the GNU General Public License (GPL). It's designed to be run on a large server farm for a website that gets millions of hits per day. MediaWiki is an extremely powerful, scalable software and a feature-rich wiki implementation, that uses PHP to process and display data stored in its MySQL database." PHP is a popular and actively supported server-side, object-oriented scripting language. There is a wealth of open source code and classes that one can modify or use for inspiration and education. The PI has been developing web applications using PHP for the past seven years.
The MediaWiki code base is being actively improved and refined. Security vulnerabilities are quickly fixed; the vast majority of them have likely been discovered by the large community of developers working with the code. The rapid growth of Wikipedia has led to efficient strategies for distributing the computational load via database replication and caching of processed content. With scaling strategies in place, interface changes using Web 2.0 technologies such as Ajax (Asynchronous JavaScript and XML) and HTML5 are starting to refine the user experience and make editing easier.
MediaWiki has also been developed with extensibility in mind. Many outside modules have been developed (see [8]) to both extend the markup language for site-specific needs and to add completely new functionality. This is convenient for the needs of HoSTEM because MediaWiki can be modified extensively without having to make changes to the core code. The HoSTEM modifications have so far been done with minimal modifications to the MediaWiki code; this practice will be maintained with further development. Performance and security enhancements to the core code will then be easily kept up-to-date.
MediaWiki will power the collaborative writing environment as the PI has extensive experience extending its functionality, using it to write over ten collaborative proposals and papers over the past seven years. This way of writing has been extremely productive and many others have shown interest in the idea. The wiki used to write this proposal is open to all and can be found at http://dssl.mne.psu.edu/nsfsdci. Figure 2 shows a screenshot displaying part of a paper [9] used to test the system. Key features are enlarged and explained in the caption.
These features are modifications to MediaWiki developed by the PI, and essential for STEM writing. Conversion from an extended wiki markup language to HTML and to PDF is done by software developed by the PI. The planned web application will sit on top of this system to ease the transition of users accustomed to word processors. An equation formatting module is already being tested and modules for making tables, adding figures, citing references, and cross-referencing are planned.
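As an illustration of what automatic numbering and cross-referencing involve, the sketch below numbers labeled equations in order of appearance and turns references into hyperlinks. The `{{eq|name}}` and `{{ref|name}}` markers are a hypothetical markup invented for this example; they are not the PI's extended wiki markup, and the actual conversion software is not shown here.

```python
import re

# Hypothetical markers: {{eq|name}} tags an equation, {{ref|name}} cites it.
LABEL = re.compile(r"\{\{eq\|([\w-]+)\}\}")
REF = re.compile(r"\{\{ref\|([\w-]+)\}\}")


def number_equations(source: str) -> str:
    """Replace labels with numbered anchors and references with links."""
    numbers = {}

    def assign(match: re.Match) -> str:
        name = match.group(1)
        # Number equations in the order their labels first appear.
        numbers.setdefault(name, len(numbers) + 1)
        return f'<span id="eq-{name}">({numbers[name]})</span>'

    def link(match: re.Match) -> str:
        name = match.group(1)
        if name not in numbers:
            return "(??)"  # unresolved reference, as LaTeX would show
        return f'<a href="#eq-{name}">({numbers[name]})</a>'

    # Two passes, so a reference may precede the equation it points to.
    numbered = LABEL.sub(assign, source)
    return REF.sub(link, numbered)


text = "By {{ref|hooke}}, the force is linear. F = -kx {{eq|hooke}}"
print(number_equations(text))
```

Resolving all labels before any references is what allows forward references, and it means inserting a new equation renumbers every later equation and citation without any effort from the author.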
Of all the discussion forum packages available, Drupal appears to be the best option for HoSTEM. It is written in PHP and uses the same supporting software as MediaWiki. Again, Drupal is used for some very large sites and will scale to the needs of HoSTEM. In addition to the PIs being familiar with the workings of Drupal, the large community at imechanica.org is already familiar with its interface.
Although the MediaWiki extensions will be developed primarily to run HoSTEM, they should be useful to any group wanting to collaborate. Thus, platform independence will be an important consideration during development. Fortunately, MediaWiki, Drupal and our custom extensions depend only on free and open source software that has been ported to nearly every host operating system. The entire package will be bundled for use by any organization. Due to the scalability of the software used, these organizations could be anything from a small research group to a large international society.
Figure 2: Screenshot of a paper formatted for online view. Highlights include equation rendering from source, automatic numbering and cross-referencing (with hyperlinks), pop-up equation references, baseline shifting for inline equations.

7. Motivation

From a practical perspective, a revolution needs to occur as a sequence of evolutions (none too big) pushing people from their comfort zone. Hopefully, in five years we will look back and wonder why things were ever how they are now. There are three main reasons the PIs feel change is needed:

7.1. The Cost of Content

While for-profit publishers will argue that they provide an important service to the community at a reasonable price, as the chair of Penn State's Faculty Senate Libraries Committee, the PI has seen firsthand the problems university libraries face as they try to keep up with the skyrocketing cost of ever more restrictive content licenses. The financial statements of for-profit publishers are the best place to get the true picture. For example, in 2006, Reed Elsevier's journal publishing division generated 2.24 billion euros and earned 683 million euros in profit, with 52% coming from scientific journals and 48% from medical journals. Net operating margins for the division are high (30%) because the content and assessment via authors, editors and reviewers comes free. While for-profit publishers are certainly investing some of their profits into better and more convenient publishing systems, their mission is still to provide the best return on investment to their shareholders, not to the people providing the content.
The enormous scatter of journal subscription costs is also telling and suggests that the market is being driven by monopolistic practices and not free exchange of information. Ted Bergstrom (economics, UCSB) has done much research on the cost to end users of journals, and he and Preston McAfee (economics, Caltech) have an excellent web site [10] estimating the value of journals in various ways. A summary of the data [11] shows that for-profit publishers charge significantly more for access (even when normalized by citations) than non-profit publishers ($26.41/article vs. $6.77/article). This primarily measures the cost relative to the contribution and is only indirectly related to the cost of production. When costs are normalized by page area, for-profit publishers still charge significantly more. For example, the cost for both the print and online version of the ASME Journal of Applied Mechanics is roughly $\$10/\mathrm{m}^{2}$ while that for Elsevier's International Journal of Solids and Structures is $\$30/\mathrm{m}^{2}$. This difference is due to the monopoly effect of full copyright transfer; once a given journal has the copyright of a given paper, libraries are forced to subscribe as researchers demand access. Libraries are then forced to pay the fees or have a gap in their collections. Imagine, for example, if each paper was involuntarily licensed to exactly two journals, and libraries could subscribe to a journal article by article. That would immediately force competition on cost and normalize fees.
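For reference, the ratios implied by the figures quoted in this section can be worked out directly; the numbers below are taken from the text and only the arithmetic is added.

```latex
\[
\frac{\$26.41/\text{article}}{\$6.77/\text{article}} \approx 3.9,
\qquad
\frac{\$30/\mathrm{m}^{2}}{\$10/\mathrm{m}^{2}} = 3,
\qquad
\frac{683\ \text{million euros}}{2240\ \text{million euros}} \approx 0.305 .
\]
```

That is, by either normalization the for-profit price is roughly three to four times the non-profit price, and the last ratio reproduces the quoted operating margin of about 30%.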

7.2. The Deterioration of the Publishing Community

The PIs feel that the current peer-review system is collapsing under its own weight. However, it is an essential part of the academic system that ensures quality and allows administrators to evaluate work in unfamiliar fields. Alas, reviewing is a job that provides the reviewer no real benefit and is barely considered for promotion and/or tenure. The number and quality of reviews is never requested and does not help one's reputation, standing, or promotion success. A very small number of people (journal editors) know whether a person is a thorough and timely reviewer. Thus, optimizing P&T success pushes the time spent reviewing papers to zero while the time spent writing them is maximized. It appears that more and more people are realizing this and it is getting harder and harder to get people to provide quality and timely reviews.
This disparity in the 'reward' system for writing and reviewing papers continues to grow and, the PIs fear, will soon reach a breaking point. Most editors will now make a decision based on two reviews when the standard used to be three. Even getting two can be difficult as it frequently requires multiple reminders. The PIs must also admit to often being slow to review papers as there always seem to be more pressing matters to attend to that are actually rewarded.
One of the reasons for this growing problem appears to be the continuing decline of a community spirit among academics. People used to, or so the PIs are told, feel much more part of a community and, thus, were much more willing to contribute to that community by doing behind-the-scenes work, such as paper reviewing, for which there is no direct benefit. While there have always been (and always will be) people who will not contribute, the lack of a community spirit and an attitude that everyone is competing with one another tends to push tasks like paper reviewing aside for much more profitable endeavors.
The PIs can only speculate on reasons for the dissolution of academic communities. They suspect one reason is the easy and anonymous access to information. While the information age has certainly benefited academics in profound ways, it has also made research more anonymous. Before everyone had easy access to online journals and photocopying, the only way to get a paper was to contact the authors directly. The authors were then aware of the people interested in and following their work; this often resulted in further discussion and collaboration. Today there is no real way to know who is reading one's papers. The only indication usually occurs when authors notice citations from other researchers with whom they have never spoken.
The PIs also think the competitive environment in which we work and position ourselves for success hampers the community spirit. With the funding rates for science and engineering so low, academics view others in their community as competitors and are less willing to share ideas fearing intellectual property theft. This is truly an unfortunate development in the academic community where pursuit and sharing of knowledge should be the top priority. With academic institutions acting more like big businesses, it is not hard to see how this competitive attitude permeates down to the faculty. While competition between researchers is nothing new and generally healthy, the additional pressures on today's academics lead to more unhealthy competition as people feel their advancement is tied to their success at 'winning' funding, awards, etc.
Another likely cause of this deterioration is internationalization and efforts by more academic institutions around the world to develop strong research programs. With so many more academics having their career advancement tied to their research success, academic communities are becoming more like large cities than small towns. Everyone passes each other anonymously and people treat each other with little respect as there is little chance of ever encountering one another again. While it is still a small world, the growth coupled with anonymity is deteriorating an important sense of community that contributes in central ways to the quality of research practices and results.
The PIs hope EvoPub and HoSTEM in particular will foster a sense of community. Discussion pages will help identify those interested in an author's work and promote collaborations and dialogue in the community. Authors will have access to statistics about when and, broadly, by whom their work was viewed. Reviewers will be able to write either attributed or anonymous reviews (or alter the classification) so good reviewers can be recognized by the community. Authors will be able to rate the reviews on their usefulness and reviewers will build a rating based on both their attributed and anonymous reviews.
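The review-rating idea described above could be represented along the following lines. This is only a minimal sketch under assumed names: the `Review` and `ReviewerProfile` classes, the 1-5 usefulness scale, and the use of a simple mean are all hypothetical choices for illustration and are not part of the HoSTEM design.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import List, Optional


@dataclass
class Review:
    """One review of a paper (hypothetical structure)."""
    reviewer_id: str
    attributed: bool                   # False means the review is anonymous
    usefulness: Optional[int] = None   # author's rating, assumed 1-5 scale


@dataclass
class ReviewerProfile:
    """Collects a reviewer's reviews and derives a rating from them."""
    reviewer_id: str
    reviews: List[Review] = field(default_factory=list)

    def rating(self) -> Optional[float]:
        """Mean usefulness over all rated reviews, attributed or anonymous."""
        scores = [r.usefulness for r in self.reviews if r.usefulness is not None]
        return mean(scores) if scores else None

    def public_reviews(self) -> List[Review]:
        """Only attributed reviews are shown under the reviewer's name."""
        return [r for r in self.reviews if r.attributed]


profile = ReviewerProfile("reviewer-1", [
    Review("reviewer-1", attributed=True, usefulness=4),
    Review("reviewer-1", attributed=False, usefulness=5),
    Review("reviewer-1", attributed=False),  # not yet rated by the author
])
print(profile.rating(), len(profile.public_reviews()))
```

In this sketch anonymous reviews still count toward the rating but are never listed publicly, which is one possible reading of "both their attributed and anonymous reviews."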

7.3. The Need for Timeliness

As the pace of discovery and competition to be first continues to increase, researchers are growing more and more frustrated with the time it takes from submission to publication. Even the time it takes to turn an idea into an acceptable submission is often much too long. While this is controlled, in part, by authors, in most engineering disciplines a short paper with a basic idea not supported by lengthy analysis and examples demonstrating usefulness will not be accepted. Thus, such papers are never submitted and the time it takes to get ideas out is unnecessarily long. This is somewhat less of a problem in science and mathematics. In the PIs' opinions, ideas are the important part of any publication and should be disseminated as quickly as possible. Surveying publications (and theses) will demonstrate that there is usually one fundamental idea (if that) in each publication that could be summarized in probably a page. The rest of the paper is there to support the validity of the idea.
A more dynamic publishing environment would allow authors to present their ideas rapidly and then fill in supporting material as it becomes available. A worry would be that subsequent study might show that the original idea was not good (something that can happen, by the way, in more traditional approaches to publishing). However, such 'papers' could be maintained in the dynamic section of OSP for comment and dissemination. If a breakthrough occurs the authors could ask the editors to consider it for the static section of OSP. If subsequent work does not validate the idea, the authors could retract the work, or leave it posted indefinitely for further comment and the education of others who might be considering similar ideas. This will allow a forum for ideas that did not work out and hopefully prevent others from going down the same path. This would be a tremendous learning resource for graduate students. Ideas could even be revisited later by the original authors or others might try to determine why a seemingly good idea did not materialize into something useful.
Another process that would become much more rapid is the review-revision process. In discussions with journal editors and many other faculty involved in reviewing papers, the PIs believe that a large reason for the delay in getting papers reviewed is that reviewers do not have time to write a thorough review rapidly. Thus, while most reviewers will immediately skim through a paper and form an opinion, they delay the process of actually writing and submitting the review until they have more time to think about it. This 'more time' never comes and under pressure from an editor the reviewer basically writes what s/he originally thought. In a dynamic reviewing environment, a reviewer need not worry about writing a lengthy or well thought out review right away. The reviewer is free to post some initial thoughts that can then be expounded upon later. At the same time, the authors can address these initial thoughts by either clarifying the reviewer's understanding and/or revising the paper. Thus, the reviewing process becomes much more finely discretized, resulting in much less time between submission and publication, and more importantly between idea generation and public dissemination and feedback. The loop is shortened in ways that can only benefit authors and the research community at large.

8. Sustainability

Making sure software is sustainable after the funds supporting its development are gone is something developers must think hard about. The PIs are aware of many software projects that stagnated once an external agency no longer supported them. Unless someone has a personal need necessitating continued development, it is hard to justify the uncompensated time needed to make improvements to solve other people's problems. We are also aware that there will be significant recurring costs associated with running an EvoPub system. Estimating such costs is not easy, but estimates can be made based on the operating costs of non-profit publishers and e-print archives.
While many non-profit disciplinary societies publish scholarly work, most still have print editions that dramatically increase cost, especially on a per subscription basis. Additionally, the financial reports the PI has studied (ASME, APS, IEEE, AIAA) do not provide an expense breakdown for just scholarly journal publishing, instead lumping all publishing expenses into one category. The PI did however find that most non-profit organizations make a profit on publishing, using the proceeds to subsidize other money-losing but beneficial activities.
One can also estimate the cost of publishing through PLoS's financial statements [12]. In 2006, their total expenses were $6.3 million, of which $2.6 million went to salaries and wages with another $1 million spent on "employees." They spent $247,000 in rent and $183,000 on travel. Their total income was just under $5 million. Clearly PLoS is targeting the high end of publishing, trying to compete with Science and Nature. With substantial support from the Gordon and Betty Moore Foundation, PLoS has decided to build an expensive infrastructure with the hope that revenue will grow to meet expenses. Again, this does not provide much insight into the production cost of a typical science or engineering journal.
The cost to run arXiv.org is likely the best estimate of (providing at least a lower bound to) the cost a system like HoSTEM might incur as it grows. arXiv.org recently posted a business model white paper [13] that outlines the cost to run it and presents some ideas to distribute this burden and reduce its dependence on Cornell University Library's budget. The operating budget for 2010 is $400,000 with 80% going to staff salaries. Other expenses include "G&A overhead, hardware, hosting, and network charges." The cost per download is $0.013 and the cost per submission is less than $7. This is in stark contrast to the $1250 per submission PLoS ONE charges authors (without copyediting) or the roughly $1000 per submission the New Journal of Physics charges (apparently with copyediting).
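The arXiv.org figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers from the text; the one added assumption, flagged in the comments, is that the whole annual budget is attributed to a single activity when deriving an implied volume, so the results are rough orders of magnitude rather than reported statistics.

```python
# Figures quoted in the text (2010 values, in US dollars).
ARXIV_ANNUAL_BUDGET = 400_000
ARXIV_STAFF_SHARE = 0.80
ARXIV_COST_PER_DOWNLOAD = 0.013
ARXIV_COST_PER_SUBMISSION = 7.0   # quoted as "less than $7", so an upper bound
PLOS_ONE_FEE = 1250.0
NJP_FEE = 1000.0


def implied_volume(budget: float, unit_cost: float) -> float:
    """Volume implied if the ENTIRE budget is attributed to one activity.

    This attribution is an assumption made for illustration only.
    """
    return budget / unit_cost


staff_cost = ARXIV_ANNUAL_BUDGET * ARXIV_STAFF_SHARE
# Because $7 is an upper bound on unit cost, this is a lower bound on volume.
min_submissions = implied_volume(ARXIV_ANNUAL_BUDGET, ARXIV_COST_PER_SUBMISSION)
downloads = implied_volume(ARXIV_ANNUAL_BUDGET, ARXIV_COST_PER_DOWNLOAD)

print(f"staff salaries:            ${staff_cost:,.0f} per year")
print(f"implied submissions:       more than {min_submissions:,.0f} per year")
print(f"implied downloads:         about {downloads / 1e6:.1f} million per year")
print(f"PLoS ONE fee / arXiv cost: about {PLOS_ONE_FEE / ARXIV_COST_PER_SUBMISSION:.0f}x")
print(f"NJP fee / arXiv cost:      about {NJP_FEE / ARXIV_COST_PER_SUBMISSION:.0f}x")
```

Under these assumptions the author-side fees of the two journals are more than two orders of magnitude above arXiv.org's per-submission cost.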
The costs then appear to depend greatly on the support and editorial staff required, and one of the advantages of the EvoPub concept is that the cost structure and pricing scheme can be whatever an organization wants. If everything is done by volunteers or if the system is being used by a small research group, expenses could be zero. If an organization decides to develop a high-end publication requiring extensive editing, formatting, and customization, there will be significant recurring costs. Ultimately, the cost and the method by which to recoup those costs will be decided by each community.
The cost to run a system like HoSTEM will obviously depend greatly on the size of each of its three layers. Part of developing the Living Review layer first is that it is the least costly one to build and sustain. The minimal required copyediting could be handled at no cost by PSU English Interns. It is difficult to measure cost in terms of dollars per submission or dollars per download during the initial stage of development, when non-recurring costs have not been amortized. However, from a long term sustainability perspective the likely cost of maintaining the Living Review layer should be approximately that needed to maintain arXiv.org, i.e. about $7 per submission.
The STEMpedia and OSP layers will likely cost significantly more per submission. Because STEMpedia will be the public-facing side of HoSTEM, significant copyediting will be required to make the articles as accessible to, e.g., K-12 students as possible. The cost per article would then likely be on the order of $1000 (assuming the preliminary content is provided free by the community) initially and roughly $100 per year on average to copyedit new material added to an existing article. The cost of the OSP system will largely be decided by the community that contributes to each area. If a community wants to maintain a very high quality with extensive copyediting and possibly custom modifications to the supporting software, the cost might approach that of the New Journal of Physics. At the other extreme, if the community decides to use the system like they do arXiv.org, the cost could be as little as $10 per submission.
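The per-layer estimates above can be combined into a rough annual cost model. The following is a minimal sketch, not a budget: the unit costs are the approximate figures quoted in the text, while the function name `annual_cost`, its parameters, and the example volumes are hypothetical.

```python
# Approximate unit costs quoted in the text (US dollars).
LIVING_REVIEW_COST = 7.0          # per submission, comparable to arXiv.org
STEMPEDIA_NEW_ARTICLE = 1000.0    # initial copyediting of a new article
STEMPEDIA_UPKEEP = 100.0          # per existing article per year
OSP_LOW, OSP_HIGH = 10.0, 1000.0  # per submission, set by each community


def annual_cost(living_reviews: int,
                new_stempedia: int,
                existing_stempedia: int,
                osp_submissions: int,
                osp_cost_per_submission: float = OSP_LOW) -> float:
    """Rough yearly recurring cost for the three HoSTEM layers."""
    return (LIVING_REVIEW_COST * living_reviews
            + STEMPEDIA_NEW_ARTICLE * new_stempedia
            + STEMPEDIA_UPKEEP * existing_stempedia
            + osp_cost_per_submission * osp_submissions)


# Hypothetical volumes: 10 reviews, 1 new STEMpedia article, 100 OSP submissions.
low = annual_cost(10, 1, 0, 100)
high = annual_cost(10, 1, 0, 100, osp_cost_per_submission=OSP_HIGH)
print(f"arXiv-like OSP community:  ${low:,.0f}")
print(f"heavily copyedited OSP:    ${high:,.0f}")
```

With these illustrative volumes the OSP layer dominates the total as soon as a community opts for extensive copyediting, which is why that choice is left to each community.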
Most start-up costs for hardware and development time are included in this budget. Recurring costs needed to keep HoSTEM running can be recouped in a variety of ways, all of which will be investigated. Additionally, different layers of HoSTEM and different sections within a layer could be supported by different business models. First, if the community supports it, the PI is confident from discussions with friends in the web content business that advertising could cover the operating costs of HoSTEM without too much intrusion on the real content. However, advertising in the STEMpedia layer would certainly be a bad idea. Second, a small voluntary fee could be charged to those, say, submitting papers or having them accepted into the archival section of an OSP collection. The recommended fee would be changed periodically to adjust for excesses or shortages. Finally, much like arXiv.org advocates in their business plan, HoSTEM could be run purely on donations from organizations such as universities and the NSF. Any combination of these funding mechanisms could also work.

9. Collaboration with Penn State's Department of English

Professional editing is one of the benefits to publishing in traditional journals and something HoSTEM will need for it to be successful. This was initially a major concern but, after speaking with faculty in Penn State's Department of English, a few exciting solutions arose.
PSU English allows seniors to apply up to six credits of ENGL495 (Internship) towards a Technical Writing Minor. Internships are required for those working towards a degree with a publishing emphasis. Co-PI Jenkins coordinates this program and will initially set up and then maintain internships with HoSTEM. This arrangement should benefit everyone involved. HoSTEM authors will get valuable feedback from well educated students to make their work more readable. Likewise, the students will gain valuable experience in one of the fastest growing and most employable careers for English majors. There is no fee to sponsor English interns, so this arrangement will be sustainable after the funding period has ended.
There are also two popular elective courses offered in English that could use HoSTEM for demonstration and/or projects. The first course is ENGL417 (Editorial Process: The process of editing from typescript through final proof) which enrolls about 100 students per year. This course is somewhat unique to PSU as most universities do not offer such a course to undergraduate students. The second course is ENGL418 (Advanced Technical Writing and Editing) which enrolls roughly 70 students per year and teaches them to prepare and edit professional papers for subject specialists. As stated in the supporting letter of Dr. Robin Schulze, English Department Head, "The system (and its development) will...be a resource for instructors and students taking ENGL 417 and 418."
During the grant period and before a great deal of content is available in HoSTEM, English interns will help with developing and testing the system. In particular, they will help with documenting the system and provide feedback about the organization and usability. Additional ideas for efficient ways to categorize, browse, and search the content will be prototyped and tested. The PI has found in developing other web applications that it is important to have many people unfamiliar with the application test it. This helps make the user interface much more intuitive for first-time users and will be an important aspect of community acceptance.
The PIs are particularly excited about this unique multi-disciplinary collaboration and hope that it fosters more interaction between STEM disciplines and the language arts. These interactions will be extremely valuable to all involved.

10. Complementary and Alternative Solutions

There has recently been a push at universities and by NSF to allow researchers to use the web to more readily collaborate and share information and data. These initiatives have been focused more on data sharing and curation than on publishing. In this regard, they will complement the EvoPub model and, in particular, HoSTEM by being a place for researchers to store and share raw data, analysis tools, computer code, etc. Within the BigTen there are two particularly interesting systems. One is Purdue's HUBzero [14] which powers the NSF-sponsored NanoHUB. The goal of HUBzero is to make it easy to bring research software tools to the web. The PIs will explore effective ways to integrate HoSTEM into the HUBzero framework once the source code is released in April. Another exciting project that is just getting underway is the Big Digital Machine (BDM) [15]. This project is being developed by Indiana University with funding from the CIC universities' (BigTen+) CIOs and Library Directors. In their words, "The BDM is intended to help identify paths and hopeful insights for future models of an open, interoperable, sustainable, enterprise-level system for dissemination of scholarship..." This effort is being promoted by IU's Vice President for Information Technology & CIO, Bradley Wheeler, and a close collaboration is planned; see supporting letter.
There are also many interesting advances in courseware publishing systems. For example, the Connexions system [16] at Rice University allows instructors to develop 'modules' of knowledge that can be aggregated, reused, and modified for the specific needs of a course. The source code for the Rhaptos platform on which it runs is now available and the PIs will explore possible future collaborations. A competing courseware development system called DynamicBooks was very recently announced by Macmillan Publishers [17]. The details are still somewhat vague about how the system will work but from their web site: "DynamicBooks titles give you unprecedented flexibility to tailor content to your presentation of course material." Other publishing companies are sure to follow. While these textbook and courseware development tools could certainly be adapted to enable EvoPub and compete with HoSTEM, they are currently targeting a different market. The publishing companies are not likely to develop something like EvoPub as it undermines their business model. However, a strong connection with Connexions could be beneficial to all.
Another project from the University of Trento in conjunction with Springer Verlag is LiquidPub [18]. This effort appears to have evolved since the PI first spoke with the developers in 2007 in hopes of collaborating. The developers appeared to have many of the same goals in mind at the time, and white papers they have posted echo much of the sentiment expressed here; they state, "The LiquidPub project proposes a paradigm shift in the way scientific knowledge is created, disseminated, evaluated and maintained." However, from demonstration videos they have posted, it appears LiquidPub is focused on aggregating and organizing existing content. For example, they say Liquid Journals "go beyond the traditional journal vision, proposing a new way of collecting, selecting and sharing scientific contributions with and within the LiquidPub community." and "...the system could be seen as an aggregator." Thus, it is still somewhat unclear if LiquidPub is working towards an EvoPub-like system and if it will complement or compete with HoSTEM.

11. Frequently Asked Questions

Since first promoting these ideas to the mechanics community in 2007, a number of questions have arisen that are addressed here.
Q: Why would anyone want to allow others to edit their work?
A: Convincing people this is a good idea may indeed be a challenge. However, Wikipedia's success can be used to establish the benefit. There were plenty of people skeptical of Wikipedia until it was shown that, with minimal policing, the idea of open source editing works. There are actually many benefits and few, if any, detriments to open sourcing one's work. All revisions are documented and can be easily undone. Revised content is only publicly viewable after an original author approves it. There is a record of who made the revisions as users will be required to register to edit. A manuscript with many grammatical and stylistic mistakes can be quickly improved. Papers could become a community effort with reviewers proposing entirely new sections.
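The safeguards listed in this answer (every revision recorded with its editor, public visibility only after author approval, easy undo) can be sketched as a small data model. This is an illustration under assumed names and is not how MediaWiki or HoSTEM implements revisions; `Page`, `Revision`, and the rule that the author's own edits are approved automatically are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Revision:
    editor: str            # registered user who made the edit
    text: str
    approved: bool = False


@dataclass
class Page:
    author: str
    revisions: List[Revision] = field(default_factory=list)

    def propose(self, editor: str, text: str) -> Revision:
        """Record a revision; the author's own edits are auto-approved (assumption)."""
        rev = Revision(editor, text, approved=(editor == self.author))
        self.revisions.append(rev)
        return rev

    def approve(self, rev: Revision, user: str) -> None:
        """Only the original author may make a revision publicly viewable."""
        if user != self.author:
            raise PermissionError("only the original author can approve revisions")
        rev.approved = True

    def undo(self, rev: Revision, user: str) -> None:
        """Withdraw approval; the revision itself stays in the history."""
        if user != self.author:
            raise PermissionError("only the original author can undo revisions")
        rev.approved = False

    def public_text(self) -> Optional[str]:
        """Text of the latest approved revision, or None if none is approved."""
        for rev in reversed(self.revisions):
            if rev.approved:
                return rev.text
        return None


page = Page(author="alice")
page.propose("alice", "Original draft.")
edit = page.propose("bob", "Original draft, with typos fixed.")
print(page.public_text())   # still the author's version
page.approve(edit, "alice")
print(page.public_text())   # now shows bob's approved edit
```

Because nothing is ever deleted from `revisions`, undoing an edit is just a change of its approval flag, and the full record of who changed what remains available.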
Q: Why would anyone want to edit other people's work?
A: Edits could range from correcting typographical or grammatical errors to writing entirely new sections. Because anyone wanting to edit something will know that it can only be published with the original author's approval, an editor will likely contact the author before making substantial changes. Editing will likely be done mostly by students.
Q: Why would anyone want to write an attributed review?
A: One impetus for the proposed work was what seems to be significant deterioration of the publishing communities. It is no wonder, given that reviewers get essentially no credit for being thorough, that the time spent on this task gets minimized. Unfortunately the community as a whole suffers. Thus, reviewers that sign their reviews, either initially or after others have supported their opinion, will be recognized for having good (or bad) opinions.
Q: Who will receive credit for work to which many people contribute?
A: When communities start to collaborate on things, attribution of credit can become an issue. Since all revisions will be tracked and the original author known, attribution of credit for authorship and reviewing will be easily determined. All contributors will be listed on the page with the page creator at the top.
Q: Who will be allowed to post, edit, and view content?
A: This will depend on the community deploying the system. For HoSTEM we will likely require editors to either have a PhD or be a graduate student in a related field. All registration applications will be screened to prevent abuse of the system. Everyone will be allowed to view finalized content in HoSTEM.
Q: Who will hold the copyright of material published in HoSTEM?
A: This will depend a great deal on feedback from the community but copyright transfer will not be required. While more study is certainly in order here, a Creative Commons [19] system like that used by Wikipedia seems like the most flexible approach. Content creators can use many different licenses ranging from ones that protect their work with full copyright to ones that release it to the public domain. If modifications are allowed they will fall under the original license with attribution for the modifier in the list of contributors.
Q: Is Open Source Publishing the same as Open Access Publishing?
A: Open Source is not the same as Open Access. EvoPub is fundamentally not about open access. While open source and open access are somewhat related, they are neither dependent upon one another nor is one necessary for the other. Depending on the specific requirements and needs of a community, an OSP journal could be open access or it could be completely closed. Open source publishing may be a way to move towards open access but this connection is at the discretion of the community.
Q: Will HoSTEM be able to scale to keep up with demand?
A: The hope for HoSTEM is that scaling will become an issue but not a problem. The PIs have intentionally chosen platforms and frameworks on which to build HoSTEM that are being used by much larger sites; pushing the state of the art in scalability should not be necessary. The platforms supporting HoSTEM have proven to be extremely robust and scalable, running all or parts of the web's most popular sites [20].

12. Letters of Commitment

In addition to support from the iMechanica.org community, included are letters of commitment from:
  • Robin Schulze (PSU), Head of the Department of English, who enthusiastically endorses the English undergraduate internships that will be supported;
  • Thomas Conkling (PSU), Head of the Engineering Library, who strongly supports the move to an economically sustainable model for academic publishing;
  • Mike Furlough (PSU), Assistant Dean for Scholarly Communications and Co-Director of the Office of Digital Scholarly Publishing, who is interested in integrating the proposed work with ongoing data curation efforts;
  • Steven Gottlieb (IU), Distinguished Professor of Physics and arXiv advisory board member, who will explore integrating the proposed work with new developments at arXiv.org; and
  • Bradley Wheeler (IU), Vice President for Information Technology & CIO, who is currently engaged in a project entitled The Big Digital Machine funded by the CIOs and the University Librarians of the CIC (the BigTen+) to develop and deploy next generation scholarly communication services in an inter-organizational collaborative infrastructure that can scale to many disciplines.

13. Development Plan, Metrics, Release Schedule and Licensing

The timeline below presents a rough plan for the PIs to complete work already started on HoSTEM and to continue to develop and promote it. The design, development, and release process will be similar to that used by the PI in other software projects and is typical throughout the (indie) software development community. That is, software will first be tested internally and then rolled out to a larger community for testing. In this case, that community is the one most familiar to the PIs, theoretical mechanics. The backend components will be tested through the use of content generated by the PIs from material at arXiv.org (with permission of the original authors). The MediaWiki extensions and Drupal integration can be thoroughly tested and debugged in this way. The basics of the STEM publishing web application will be designed based on the needs of the PI. Once a prototype (alpha) application is in place, he will gather feedback and feature requests from colleagues. The PIs will solicit beta testers of the basic functionality from the mechanics community via iMechanica.org. Once the basic functionality has been thoroughly tested and debugged, the application will be released as a public beta and work will begin on adding and testing user-requested features.
A demonstration of the Living Review and STEMpedia sub-systems is planned for the end of 2011. At this point we would like to have at least 10 somewhat related review articles in the system and at least one STEMpedia article near completion. At the end of the funding period, a beta of the OSP subsystem including the proposed web application for scientific writing will be opened for public testing. The backend components of HoSTEM will be ready for final release at this point and the PIs will develop a user friendly installation package to promote the adoption of EvoPub in other user groups and communities.
All software will be developed and released under the GNU General Public License, the same license used by MediaWiki and Drupal. Throughout the development process the source code will be available in a public repository at GitHub.com.

14. References

[1] [online]. Available from: http://www.plosone.org [cited February 25, 2010].
[2] [online]. Available from: http://www.atmos-chem-phys.net/volumes_and_issues.html [cited February 25, 2010].
[3] “Nature Precedings,” [online]. Available from: http://precedings.nature.com [cited February 25, 2010].
[4] “Wikipedia,” [online]. Available from: http://www.wikipedia.com/ [cited February 25, 2010].
[5] “280North,” [online]. Available from: http://280north.com/ [cited February 25, 2010].
[6] “Cappuccino Development Framework,” [online]. Available from: http://cappuccino.org/ [cited February 25, 2010].
[7] Z. Suo and T. Li. “iMechanica: web of mechanics and mechanicians,” [online]. Available from: http://imechanica.org [cited 08.01.2008].
[8] “MediaWiki,” [online]. Available from: http://www.mediawiki.org/ [cited February 25, 2010].
[9] C. Nisoli, P. Lammert, E. Mockensturm, and V. Crespi, “Carbon nanostructures as an electromechanical bicontinuum,” Physical Review Letters, vol. 99, p. 045501, 2007. Available from: http://arxiv.org/pdf/0704.1305.
[10] T. Bergstrom and P. McAfee. “Journal Cost-Effectiveness Search,” [online]. Available from: http://www.journalprices.com [cited February 25, 2010].
[11] T. Bergstrom and P. McAfee. “Journal Cost-Effectiveness Search: Summary Statistics,” [online]. Available from: http://www.hss.caltech.edu/~mcafee/Journal/Summary.pdf [cited February 25, 2010].
[12] “PLoS IRS Form 990: Return of Organization Exempt From Income Tax,” [online]. Available from: http://www.guidestar.org/FinDocuments/2006/680/492/2006-680492065-02cc2b4b-9.pdf [cited February 25, 2010].
[13] “arXiv.org Business Model,” [online]. Available from: http://arxiv.org/help/support/whitepaper [cited February 25, 2010].
[14] “HUBzero,” [online]. Available from: http://hubzero.org/ [cited February 25, 2010].
[15] “CIC Shared Storage, The Big Digital Machine,” [online]. Available from: http://www.cic.net/Home/Projects/Technology/SharedStorage/Introduction.aspx [cited February 25, 2010].
[16] “Connexions,” [online]. Available from: http://cnx.org/ [cited February 25, 2010].
[17] “DynamicBooks,” [online]. Available from: http://dynamicbooks.com/ [cited February 25, 2010].
[18] “LiquidPub,” [online]. Available from: http://liquidpub.org/ [cited February 25, 2010].
[19] “Creative Commons,” [online]. Available from: http://www.creativecommons.org/about [cited February 25, 2010].
[20] “PHP Usage at Wikipedia,” [online]. Available from: http://en.wikipedia.org/wiki/Php#Usage [cited February 25, 2010].

15. Footnotes

  1. Users are encouraged to pay a scientific editing service to improve the readability.

  2. Authors of Applied Mechanics Reviews articles are not given this permission by ASME.

  3. Note that discussion posts can be made either anonymously or with attribution.
