This article describes our ups and downs with composition events (DOM Level 3 Events) in Chrome on Android. If you have ever wondered why there is no fully supported Android version of OX Text, here is the background story.
Composition events are used by a browser to report text input that arrives in a supplementary or alternate manner rather than through keyboard events. Examples are adding accents to characters, building the logograms of many Asian languages, selecting word suggestions on a mobile device’s on-screen keyboard, auto-correction, and speech recognition that converts voice into text. In this context an IME (input method editor) plays an important part for people who type in many Asian languages. An IME is an application that allows a standard keyboard to be used to type characters and symbols that are not directly represented on the keyboard itself. It is normally part of the user’s operating system and not a specific part of the browser. Chrome on Android normally uses composition events for most text input methods, e.g. Swype, normal keys on the virtual keyboard, auto-completion and word suggestions.
The W3C working draft “DOM Level 3 Events” defines specific events, called composition events, to support all these alternate ways to input text (http://www.w3.org/TR/DOM-Level-3-Events/#events-compositionevents).
OX Document applications use a quite different approach to track changes to the document. Changes are detected and converted into operations, which describe what has been done. You can find more information on this approach in the article OX Documents – Roundtrip and Operations.
OX Text uses the HTML contenteditable attribute to enable the user to input text. The keyboard and composition events are captured to detect what has happened and to convert the text input into specific operations (normally insertText or delete operations). Keyboard events can be processed easily with this approach: we call preventDefault(), create the specific operations and process them, which modifies the DOM and sends the operations to the server side. Well, “easily” is not true for Chrome on Android, where we had to wait about one year until the Chromium team fixed a major issue with key codes in that environment. Unfortunately, not all issues in that area have been fixed to this day. The main point is that we don’t want the browser to manipulate our DOM directly; every change should go through our operations.
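The keyboard path described above can be sketched roughly as follows. This is an illustration only, not the OX Text code: the operation names insertText and delete come from the text above, while the fields (start, end, text), the KeyLike type and the function operationForKey are assumptions made for this example.

```typescript
// Hypothetical operation shapes; only the names "insertText" and "delete"
// are taken from the article, the fields are assumed for illustration.
type Operation =
  | { name: "insertText"; start: number; text: string }
  | { name: "delete"; start: number; end: number };

// Minimal structural type, so the function accepts a real KeyboardEvent
// as well as a plain stub object.
interface KeyLike {
  key: string;
  preventDefault(): void;
}

// Translate one key event into an operation. The browser default is
// suppressed only when we take over the change ourselves.
// Simplification: modifier keys, selections and keys other than
// Backspace are ignored in this sketch.
function operationForKey(ev: KeyLike, cursor: number): Operation | null {
  if (ev.key.length === 1) {
    // A single printable character: insert it at the cursor position.
    ev.preventDefault();
    return { name: "insertText", start: cursor, text: ev.key };
  }
  if (ev.key === "Backspace" && cursor > 0) {
    // Remove the character in front of the cursor.
    ev.preventDefault();
    return { name: "delete", start: cursor - 1, end: cursor };
  }
  // Anything else is left to the browser.
  return null;
}
```

In a browser, a function like this would be called from a keydown listener on the contenteditable element, and the returned operation would then be applied to the DOM and sent to the server.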
Applying the same approach to composition events turned out to be very tricky and revealed many problems and bugs in Chrome’s implementation.
The composition/input event order (according to the W3C working draft) is defined as follows:
The first idea would be to call preventDefault() in the beforeinput event and retrieve the text from the data attribute sent with compositionend. This text could be used to create the specific operations, make the DOM modifications and send the operations to the server. Unfortunately, Chrome and other browsers don’t support this event; see the open issue in the Chromium project. It is not clear whether this event will ever be implemented, because it is a matter of controversial debate. The compositionend event is not cancelable and is therefore not a possible solution either.
So the easiest approach is not possible and other ideas have to be used. The next idea we tried was to create a separate DOM element and move the focus into it as soon as compositionstart is fired. The user’s changes would then happen in a safe, isolated place, and the text could be retrieved after the input event has been received. This is not possible either, because Chrome does not allow the focus to be moved once a composition has started.
The last approach we tried: just let the browser write into the DOM and find out what the user did from the compositionend event data. At the end of a composition we remove the text written to the DOM and create the operations that make the final changes. You can guess the outcome: it failed again because of problems with how Chrome handles compositions. In rare cases the browser lies and does not provide the inserted text in the compositionend event data. This results in a difference between the real DOM and the operations that describe the changes made by the user. We stopped our efforts to work around this with a diff approach, which detects the changes once a composition session has ended.
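To illustrate what a diff approach could look like, here is a minimal sketch that compares the text before and after a composition session and reports the changed span. It is not the code we experimented with; the TextChange type and the diffText function are hypothetical names, and the sketch works on plain strings (UTF-16 code units) rather than on DOM nodes.

```typescript
// Hypothetical result type: what was replaced, and where.
interface TextChange {
  start: number;    // index of the first differing character
  removed: string;  // text present before the composition only
  inserted: string; // text present after the composition only
}

// Find the single changed span by stripping the common prefix and the
// common suffix of both strings. Returns null if nothing changed.
function diffText(before: string, after: string): TextChange | null {
  if (before === after) {
    return null;
  }
  const shortest = Math.min(before.length, after.length);

  // Length of the common prefix.
  let prefix = 0;
  while (prefix < shortest && before[prefix] === after[prefix]) {
    prefix++;
  }

  // Length of the common suffix, not overlapping the prefix.
  let suffix = 0;
  while (
    suffix < shortest - prefix &&
    before[before.length - 1 - suffix] === after[after.length - 1 - suffix]
  ) {
    suffix++;
  }

  return {
    start: prefix,
    removed: before.slice(prefix, before.length - suffix),
    inserted: after.slice(prefix, after.length - suffix),
  };
}
```

Comparing "Hello world" with "Hello brave world" yields an insertion of "brave " at index 6, which could be mapped to one insertText operation. A diff like this can only guess the changed span; with repeated characters several spans explain the same result, which is one reason such a workaround is fragile.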
Currently we are trying a hidden textarea element into which the user types the text. At the end of a composition we extract the text and create the specific operations. This approach needs additional work, such as displaying an artificial text cursor and enhancing clipboard handling. The first experiences look promising, and we are working on it to enable OX Text on Android without any restrictions.
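The extraction step can be sketched like this. Again, this is only an illustration under assumptions: the operation fields, the TextAreaLike type and the flushComposition function are invented for this example and do not show the actual OX Text implementation.

```typescript
// Hypothetical operation shape; only the name "insertText" is taken
// from the article.
interface InsertTextOperation {
  name: "insertText";
  start: number;
  text: string;
}

// Minimal structural type, satisfied by an HTMLTextAreaElement as well
// as by a plain stub object.
interface TextAreaLike {
  value: string;
}

// Read the composed text from the hidden textarea, clear the textarea
// for the next composition, and describe the change as operations.
// Returns the operations together with the new cursor position.
function flushComposition(
  area: TextAreaLike,
  cursor: number
): { operations: InsertTextOperation[]; cursor: number } {
  const text = area.value;
  area.value = "";
  if (text.length === 0) {
    // Nothing was composed (e.g. the composition was cancelled).
    return { operations: [], cursor };
  }
  return {
    operations: [{ name: "insertText", start: cursor, text }],
    cursor: cursor + text.length,
  };
}

// In a browser this would be wired up roughly as:
//   textarea.addEventListener("compositionend", () => {
//     const result = flushComposition(textarea, currentCursor);
//     ...apply result.operations to the document and send them...
//   });
```

Because the text is read from the textarea’s value rather than from the compositionend event data, this step does not depend on the browser reporting the composed text correctly.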
Conclusion:
Using composition events with contenteditable=true is a major challenge in Chrome on Android. Chrome is not an especially bad example; every browser has its bugs and inconsistencies in implementing a specification. Firefox has many more problems with composition on Android. Safari on iOS is inconsistent: IME input works well with composition events, but why Siri modifies the DOM without using composition events is something I don’t understand.
It’s a shame that the W3C specification “DOM Level 3 Events”, which was created in September 2000, is even today not correctly and consistently implemented by the major browsers. A new specification for IME input exists, which should give developers more flexibility and enhance the DOM Level 3 events. Hopefully this API will put more focus on composition in the browsers, lead to better implementations, and get issues that were reported long ago fixed.