To address crashes in the layouter, we have been doing fuzz testing: we send random events to the editor and track the crashes in JavaScript. This allowed us to fix many hard-to-find bugs in the editor.
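
To give an idea of what such a fuzzing loop can look like, here is a minimal sketch. The `editor.dispatch` method, the event types and the event fields are assumptions made up for this illustration, not the real editor API. A seeded random generator is used so that a crashing run can be replayed.

```javascript
// Small seeded pseudo-random generator, so every run can be reproduced from its seed.
function makeRng(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6D2B79F5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical event kinds; a real editor would have many more.
const EVENT_TYPES = ["keypress", "backspace", "click", "undo"];

function randomEvent(rng) {
  return {
    type: EVENT_TYPES[Math.floor(rng() * EVENT_TYPES.length)],
    key: String.fromCharCode(97 + Math.floor(rng() * 26)), // a..z
    x: Math.floor(rng() * 800),
    y: Math.floor(rng() * 600),
  };
}

// Sends `count` random events to the editor and records every exception
// together with the seed and step needed to replay it.
function fuzz(editor, seed, count) {
  const rng = makeRng(seed);
  const crashes = [];
  for (let step = 0; step < count; step++) {
    const event = randomEvent(rng);
    try {
      editor.dispatch(event);
    } catch (err) {
      crashes.push({ seed, step, event, message: err.message });
    }
  }
  return crashes;
}
```

Running `fuzz` twice with the same seed against an editor in the same starting state produces the same event sequence, which is what makes a recorded crash debuggable.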

We have finally completed a huge amount of work related to server-side PDF generation. The open-source library we used had a problem with interlaced PNGs, so we had to replace the PNG reading library inside it. We also put a lot of effort into caching TTF information from the /fonts folder and added a few heuristics for replacing fonts when some of them are missing. We also solved some performance issues during PDF generation (we join some contiguous runs together to gain performance). All of this moves us towards a high-fidelity DOCX2PDF converter (though some elements of DOCX are not supported yet).
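
The idea of joining runs can be sketched as follows. This is only an illustration: the run shape (`text`, `font`, `size`, `bold`, `italic`) is an assumption, and a real converter would compare many more attributes before deciding that two runs are compatible.

```javascript
// Two runs can be joined when all of their formatting attributes match.
// The attribute names are hypothetical.
function sameFormat(a, b) {
  return a.font === b.font && a.size === b.size &&
         a.bold === b.bold && a.italic === b.italic;
}

// Joins neighbouring runs with identical formatting into one run,
// so that fewer text drawing operations are needed later.
function mergeRuns(runs) {
  const merged = [];
  for (const run of runs) {
    const last = merged[merged.length - 1];
    if (last && sameFormat(last, run)) {
      last.text += run.text;
    } else {
      merged.push({ ...run }); // copy, so the input runs are not modified
    }
  }
  return merged;
}
```

For example, three runs "Hello", " world" and "!" where only the last one is bold would come out as two runs: "Hello world" and "!".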

We have supported line spacing and many fidelity details for enumerations, including correct indentation, tab spacing and so on. Support for bold (and some other ‘toggle’ attributes) also turned out to be quite complex. The DOCX specification has a few pages explaining the behaviour of such attributes, and it was still not clear how to calculate a simple bold attribute in the final layout. We automatically generated a 40-page test DOCX file with tons of combinations of bold (in styles and run attributes) and finally managed to render it correctly. So now we have some more tests.
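
To show why a toggle attribute is tricky, here is one simplified reading of the rule: direct run formatting wins; otherwise a document default of true gives true; otherwise the attribute is on when an odd number of style levels set it to true. The sketch below encodes just that reading. The function name and the input shape are assumptions, each entry of `styleLevels` is expected to be the already resolved value for one level of the style hierarchy, and the real rules have more cases than this.

```javascript
// defaultBold: value from the document defaults (true / false / undefined)
// styleLevels: resolved bold value per style level, e.g. [paragraphStyle, characterStyle]
// directBold:  value set directly on the run, or undefined when not set
function effectiveBold(defaultBold, styleLevels, directBold) {
  if (directBold !== undefined) return directBold; // direct formatting is absolute
  if (defaultBold === true) return true;
  let result = false;
  for (const value of styleLevels) {
    if (value === true) result = !result; // every "true" level flips the state
  }
  return result;
}
```

Under this reading, a bold paragraph style combined with a bold character style gives non-bold text: `effectiveBold(false, [true, true], undefined)` returns `false`, while `effectiveBold(false, [true], undefined)` returns `true`.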

It’s time to write tests. We are gradually moving from the alpha to the beta stage. Many times, refactoring caused some other functions to crash. So from now on we have Selenium tests, which open a browser, upload a known document, make some changes and expect certain results. The tests even cover some collaborative editing features. We also have smoke tests, which try to parse (and convert to PDF) all documents in our test folder. So if there is any buggy code that may crash the layouter, we will know about it.

Tons of refactoring again, as usual. Instead of doing parseXml(toXml(doc)) on the server, we now clone the document in a clever way. This gave us some more performance on huge documents, but we plan to get rid of this construction altogether in the future. We have totally refactored the XML handling code; now it is clean and easy to read. This was a huge job. We also fixed some memory leaks caused by the paragraph cache and the collaborative editing algorithm.
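
The difference between the two approaches can be illustrated with a plain recursive copy. This is not the clever cloning mentioned above, only a sketch of the general idea of copying the tree directly instead of serializing it to text and parsing it again. The node shape (`name`, `attrs`, `text`, `children`) is an assumption.

```javascript
// Copies a document tree node by node, without going through an XML string.
function cloneNode(node) {
  return {
    name: node.name,
    attrs: { ...node.attrs },              // attributes are flat key/value pairs
    text: node.text,
    children: node.children.map(cloneNode), // recurse into child nodes
  };
}
```

The copy shares nothing with the original, so changing one of them never affects the other.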

Alignment. The user can align text to the right, to the center and to the left. This is a feature that was always postponed. We also have a nice caption in the toolbar that allows you to set the document title. It will be used when the document is downloaded or exported.
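
At the layout level, alignment comes down to deciding where each line starts inside the available width. A minimal sketch, with a hypothetical function name and parameters:

```javascript
// Returns the x offset at which a line of `lineWidth` should start
// inside a container of `containerWidth`.
function lineOffset(align, containerWidth, lineWidth) {
  const free = Math.max(0, containerWidth - lineWidth); // never shift an overflowing line
  if (align === "right") return free;
  if (align === "center") return free / 2;
  return 0; // "left" and anything unknown
}
```

For a 200-unit line in a 500-unit container this gives 0 for left, 150 for center and 300 for right alignment.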

We have supported copy to HTML and also paste from HTML. This makes it possible to exchange formatted text between CollabOffice and MSWord.
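
The copy direction can be sketched like this. The run shape (`text`, `bold`, `italic`) is an assumption, and a real implementation would also have to carry fonts, sizes, paragraphs and so on.

```javascript
// Escapes the characters that have a special meaning in HTML text.
function escapeHtml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Converts formatted runs into an HTML fragment for the clipboard.
function runsToHtml(runs) {
  return runs.map((run) => {
    let html = escapeHtml(run.text);
    if (run.italic) html = "<i>" + html + "</i>";
    if (run.bold) html = "<b>" + html + "</b>";
    return html;
  }).join("");
}
```

Escaping matters here: a run containing `a<b` must end up as visible text in the target application, not as a broken tag.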

We have improved our scrollbars, so they now support drag-and-drop and not only clicking. The bottom scrollbar is not perfect yet, though.

A lot of time was invested into cross-browser XML support. Yes, JS code that parses XML can behave differently in different browsers in the general case. We have refactored the entire codebase to use only a small subset of the XML API, so it behaves the same way in all browsers and also on the server.

We have supported server-side rendering to PNG and PDF. We had to perform a huge refactoring to keep the codebase clean. Currently text is saved as a set of glyphs, and that will probably force us to migrate to another rendering library, but for the first iteration it is OK.

Refactoring, refactoring, refactoring… Many performance issues were fixed: the editor is now much faster in Mozilla because it re-lays out the content only when the document changes. We also fixed an image resizing bug. Now we are collecting methods into classes like Layout, Font, etc. to reduce the JS importing code and to perform layout on the server for testing.

I am able to render my resume. It doesn’t look perfect yet, but the majority of elements are there and you can edit the text.

Among other things, we have implemented an undo layer. Each modification to the document is now being tracked and recorded. This will help us to support the undo function during collaboration and to make some other improvements in the kernel.
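
A common way to build such a layer is to record, for every modification, the inverse operation that reverts it. The sketch below shows this for a single user and a plain text document. The operation shapes and class name are assumptions for illustration, and undo during collaboration needs more than this, because other users' changes can shift positions.

```javascript
// Applies an insert or delete operation to a string and returns the new string.
function applyOp(text, op) {
  if (op.type === "insert") {
    return text.slice(0, op.pos) + op.text + text.slice(op.pos);
  }
  return text.slice(0, op.pos) + text.slice(op.pos + op.length); // delete
}

class UndoStack {
  constructor() {
    this.inverses = [];
  }

  // Applies the operation and records how to revert it.
  apply(doc, op) {
    const inverse = op.type === "insert"
      ? { type: "delete", pos: op.pos, length: op.text.length }
      : { type: "insert", pos: op.pos, text: doc.text.slice(op.pos, op.pos + op.length) };
    doc.text = applyOp(doc.text, op);
    this.inverses.push(inverse);
  }

  // Reverts the most recent operation; returns false when there is nothing to undo.
  undo(doc) {
    const inverse = this.inverses.pop();
    if (inverse === undefined) return false;
    doc.text = applyOp(doc.text, inverse);
    return true;
  }
}
```

Note that the inverse of a delete has to remember the deleted text, which is why it is captured before the operation is applied.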

Now we have support for embedding into third-party websites. So if anyone wants to have our editor on their site, they can add a few instructions and have it there. Please note that it is not enough to add an iframe: browsers have security restrictions. To overcome such restrictions we are using JSONP and many other small tricks.
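
With JSONP, the embedding page adds a script tag that points at the server and passes the name of a callback function, and the server answers with a piece of JavaScript that calls this function with the data. The server side of this can be sketched as below. The function name and the validation rule are assumptions for illustration, not the actual server code.

```javascript
// Only plain (optionally dotted) identifiers are accepted as callback names,
// so the response cannot be used to inject arbitrary script.
const CALLBACK_PATTERN = /^[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*$/;

// Builds the body of a JSONP response: a call to the named callback with the data.
function jsonpResponse(callbackName, data) {
  if (!CALLBACK_PATTERN.test(callbackName)) {
    throw new Error("invalid JSONP callback name");
  }
  const json = JSON.stringify(data)
    .replace(/\u2028/g, "\\u2028")  // these two characters are valid in JSON,
    .replace(/\u2029/g, "\\u2029"); // but older JS engines treat them as line breaks
  return callbackName + "(" + json + ");";
}
```

For example, `jsonpResponse("onDocLoaded", { id: 7 })` returns the string `onDocLoaded({"id":7});`, which the browser executes as a normal script.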

Today, while working on baseline alignment, I’ve noticed a bug in MSWord, and this is not the first one :)

We have implemented a simple HTML toolbar. We put only simple functions there, such as alignment, images, tables, etc. The functions we have on our toolbar are the scope of our work for the beta version. The average user wouldn't need more. But later, after the beta version, we will of course support as many DOCX functions as possible.

Now we support simple bullets and enumerations from DOCX files. They look the same as MSWord’s, but of course only for simple cases so far. We have also discussed how to implement multiuser collaboration with undo support. We know how to do this, at least theoretically.

We continue development of our cool editor. Now we have beta support for text flowing around images. Some letters are not positioned completely WYSIWYG compared to MSWord, because there is some spacing around images that we currently ignore.

We have started a season of huge refactoring. We have completely refactored the document representation, so the entire DOCX file with all its parts is stored inside the browser. This allows us to ignore elements that we don’t support yet and still save the document back after editing. None of this affects collaborative editing. If the document was not changed, the DOCX file stays the same.

Added a small interface for file uploading and selecting. We are not at the stage of automated testing yet, so for now it is enough to click through a few documents and make sure that nothing is broken.

We are constantly improving, as usual. Here are some things that we have done:

  • basic table support; tables can span many pages, which is not an easy thing to implement
  • GUI for image uploading
  • recursive object layout (i.e. images inside tables and so on); this is at a very alpha stage, and we’ll obviously have to return to it
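
To illustrate the table item from the list above, here is a much simplified sketch of splitting table rows across pages. It assumes that every row has a known height and is never split itself; the function name and the row shape are made up for this example, and real tables have many more complications (rows taller than a page, merged cells, repeated headers).

```javascript
// rows: array of { height }; pageHeight: usable height of one page.
// Returns an array of pages, each page being an array of rows.
function paginateRows(rows, pageHeight) {
  const pages = [[]];
  let used = 0; // height already taken on the current page
  for (const row of rows) {
    const current = pages[pages.length - 1];
    if (current.length > 0 && used + row.height > pageHeight) {
      pages.push([row]); // row does not fit, start a new page with it
      used = row.height;
    } else {
      current.push(row); // fits, or the page is empty and the row has to go somewhere
      used += row.height;
    }
  }
  return pages;
}
```

With five rows of height 40 and a page height of 100, the rows end up on three pages holding 2, 2 and 1 rows.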

We have created a nice toolbar and also supported italic, underline and font sizes. We have implemented paragraph caching to improve performance with huge documents.
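
The general idea of a paragraph cache is that a paragraph is laid out again only when its content or the available width has changed. A minimal sketch, assuming a hypothetical `layoutParagraph(paragraph, width)` function and paragraphs with a `text` field:

```javascript
class ParagraphCache {
  constructor(layoutParagraph) {
    this.layoutParagraph = layoutParagraph;
    // WeakMap: entries disappear together with the paragraph objects,
    // so removed paragraphs do not keep their layout alive.
    this.entries = new WeakMap();
  }

  // Returns the cached layout, or computes and stores a new one.
  get(paragraph, width) {
    const key = paragraph.text + "\u0000" + width;
    const hit = this.entries.get(paragraph);
    if (hit && hit.key === key) return hit.lines;
    const lines = this.layoutParagraph(paragraph, width);
    this.entries.set(paragraph, { key, lines });
    return lines;
  }
}
```

In a huge document, typing into one paragraph then costs one paragraph layout instead of a layout of the whole document.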

We are starting our official blog and will keep you informed about the progress of our project. The project has just started, but we have already made good progress. By now we have managed to create a stable algorithm for collaborative editing. We have a small prototype that can load a DOCX file, unzip it inside the browser, and draw it with multiuser support. At this stage we are very limited in functionality: we support only the bold style and only one font.