Translating User Input

This section explains the translation of physical user events into programmatic character strings or special keyboard data (such as "backspace"). This kind of work should be done by toolkits. If you can use a toolkit to manage event processing for you, do so, and blissfully ignore this section. If you are writing a toolkit text object, or are writing a truly extraordinary application, then this section is for you.

This section on translating user input covers these topics:

* "About User Input and Input Methods" presents an overview of user input and input methods.
* "About X Keyboard Support" covers X keyboard support, including keys, keycodes, keysyms, and composed characters.
* "Input Methods (IMs)" describes how input methods are opened and closed.
* "IM Styles" discusses the use and naming of IM styles.
* "Input Contexts (ICs)" explains finding an IM style, IC values, pre-edit and status attributes, and creating and using ICs.
* "Events Under IM Control" describes differences in processing events under IM control, including the XFilterEvent() and LookupString routines.

About User Input and Input Methods

Just as internationalized programs cannot assume that data is in ASCII, they cannot assume that user input will use any specific keyboard. Keyboards change from country to country and language to language; internationalized software should never assume that a certain position on the keyboard is bound to a certain character, or that a given character will be available as a single keystroke on all keyboards.

No useful physical keyboard (not even one specifically designed for multilingual work) could possibly contain a key for every character we would ever wish to type. Certainly there are characters commonly used in other areas of the world that are not present on most USA keyboards. So methods have been invented that provide for input of almost any known character on even the most naïve keyboards. These schemes are referred to as input methods (IMs).
Input methods vary significantly in design, use, and behavior, but there is a single API that developers use to access them. The object is for the application simply to ask for an IM and let the system check the locale and choose the appropriate IM. Some IMs are complex; others are very simple.

The API is designed to be a low-level interface, like Xlib. Usually, only toolkit text object authors must deal with the IM interfaces. However, some application developers are unable to use toolkit objects, so the concepts are described here.

Reuse Sample Code

A sample program demonstrating some of the concepts in this section is given in Chapter 11 of the Xlib Programming Manual, Volume One. Looking carefully at that code may be easier than starting from scratch.

GL Input

The old GL function qdevice() has a hard-coded view of a keyboard (see /usr/include/gl/device.h for details). Some flexibility, particularly for Europe, is available if you queue KEYBD instead of individual keys, but the GL has no general solution to non-ASCII input. There is no supported way to input Chinese (for instance) to the old GL. OpenGL does not contain input code but leaves that to the operating environment, which in IRIX means X.

In short, support for internationalized input means a departure from qread(). Under IRIX, that means using mixed-model input, which is all the more reason to use a toolkit.

About X Keyboard Support

This section provides some background that may help make the following sections easier to understand.

Keys, Keycodes, and Keysyms

When a client connects to the X server, the server announces its range of keycodes and exports a table of keysyms. Each key event the client receives carries a single-byte keycode, which directly represents a physical key, and a state field, a bit mask that represents currently engaged modifier keys, such as Shift or Alt.

Note: The mapping of state bits to modifiers is done by another table acquired from the server.
Keysyms are well defined, and there has been an attempt to have a keysym for every engraving one might possibly find on any keyboard, anywhere. (An engraving is the image imprinted on a physical key.) These are contained in /usr/include/X11/keysymdef.h. Keysyms represent the engravings on the actual keys, but not their meanings. The server's idea of the keysym table can be changed by clients, and clients may receive MappingNotify events when this remapping happens, but such events don't happen often.

When a client receives a Key event, it asks Xlib to use the keycode to index into its keysym table to find a list of keysyms. (This list is usually very short. Most keys have only one or two engravings on them.) Using the state field, Xlib chooses a keysym from the list to find out what was engraved on the key the user pressed.

At this point, the client can choose to act on the keysym itself (if, for instance, it was a backspace) or it can ask for a character string represented by the keysym (or both). Generating such a string is tricky; it is discussed in "Input Methods (IMs)," below.

Details on X keyboard support can be found in X Window System, Third Edition, from Digital Press. Details on input methods are also available in that book, as well as in the Xlib Programming Manual, Volume One.

Composed Characters

There are two ways to compose characters that do not exist on a keyboard: explicit and implicit. It is common for an application to be modal and switch between the two. For example, Japanese input of kana is often done via implicit composition. Users switch between a mode where input is interpreted as romaji (Latin characters) and a mode where input is translated to kana.

Furthermore, both styles may operate simultaneously. While an application is supporting implicit composition of certain characters, other characters may be composable via explicit composition.

Not every keystroke produces a character, even if the associated keysym normally implies character text.
The event-to-string translation routines figure out what result a given set of keystrokes should produce (see "Using XLookupString(), XwcLookupString(), and XmbLookupString()" in this section).

Character composition from the user's perspective is discussed in the compose(5) and composetable(5) reference pages.

Explicit Composition

Explicit composition is requested when the user presses the Compose key and then types a key sequence that corresponds to the desired character. For example, to compose the character ñ under some keymaps, you might press the Compose key and then type ~n.

Note: The xmodmap(1) reference page tells how to map the XK_Multi_key keysym onto whatever key you want to use as Compose.

Implicit Composition

Implicit composition mimics many existing European typewriters that have "dead" keys: keys that type a character but do not advance the carriage. When a special "dead" key is struck, the system attempts to compose a character using the next character struck. For example, on a keyboard that has a diaeresis (¨) and an O, but no Ö, you would strike ¨ and then O to compose Ö.

Implicit composition support usually comes with some specified way to leave characters uncomposed.

Supported Keyboards

IRIX currently supports 16 keyboard layouts: American, Belgian, Czech, Danish, English, French, German, Italian, Norwegian, Polish, Portuguese, Russian, Spanish, Swedish, Swiss, and Turkish. The American keyboard needs only ASCII.

Input Methods (IMs)

Input methods (IMs) are ways to translate keyboard-input events into text strings. You would use a different input method, for instance, to type on a USA keyboard in Chinese than to type on the same keyboard in English. Nobody would build a keyboard suitable for direct input of the tens of thousands of distinct Chinese characters.

IMs come in two flavors, front-end and back-end. Both types can use identical application programming interfaces, so no generality is lost by using back-end methods for the examples here.
To use an IM, follow these steps:

1. Open the IM.
2. Find out what the IM can do.
3. Agree upon capabilities to use.
4. Create input contexts with preferences and window(s) specified (see "Input Contexts (ICs)").
5. Set the input context focus.
6. Process events.

Although all applications go through the same setup when establishing input methods, the results can vary widely. In a Japanese locale, you might end up with networked communications with an input method server and a kanji translation server, with circuitous paths for Key events. But in a Swiss locale, for example, it is likely that nothing would occur besides a flag or two being set in Xlib. Since operating in non-Asian locales ends up bypassing almost all of the things that might make input methods expensive, Western users are not noticeably penalized for using Asia-ready applications.

Opening an Input Method

XOpenIM() opens an input method appropriate for the locale and modifiers in effect when it is called (see the XOpenIM(3X11) reference page). The locale is bound to that IM and cannot be changed. (But you could open another IM if you wanted to switch later.) Strings returned by XmbLookupString() and XwcLookupString() are encoded in the locale that was current when the IM was opened, regardless of current input context. The syntax is

    XIM XOpenIM(Display *dpy, XrmDatabase db, char *res_name, char *res_class);

The res_name is the resource name of the application, res_class is the resource class, and db is the resource database that the input method should use for looking up resources private to itself. Any of these can be NULL.

The fragment in Example 16-7 shows how easy it is to open an input method.

Example 16-7 : Opening an IM

    XIM im;

    im = XOpenIM(dpy, NULL, NULL, NULL);
    if (im == NULL)
        exit_with_error();

XOpenIM() finds the IM appropriate for the current locale.
If XSupportsLocale() has returned good status (see "Initialization for Xlib Programming") and XOpenIM() fails, something is amiss with the administration of the system.

XSetLocaleModifiers() configures the locale modifiers. The local host X locale modifiers announcer (the XMODIFIERS environment variable) is appended to the modifier list to provide default values on the local host. The modifier list argument is a null-terminated string containing zero or more concatenated expressions of this form:

    @category=value

For example, if you want to connect to the Input Method Server xwnmo, set the modifier _XWNMO as follows:

    XSetLocaleModifiers("@im=_XWNMO");

Or, set the environment variable XMODIFIERS to the string @im=_XWNMO and execute

    XSetLocaleModifiers("");

Note: The library routines are not prepared for the possibility of XSupportsLocale() succeeding and XOpenIM() failing, so it's up to application developers to deal with such an eventuality. (This circumstance could occur, for example, if the IM died after XSupportsLocale() was called.) This topic is under some debate in the MIT X Consortium.

If the modifiers given to XSetLocaleModifiers() are wrong, XOpenIM() will fail.

Most of the complexity associated with IM use comes from configuring an input context to work with the IM. Input contexts are discussed in "Input Contexts (ICs)".

To close an input method, call XCloseIM().

IM Styles

If the application requests it, an input method can often supply status information about itself. For example, a Japanese IM may be able to indicate whether it is in Japanese input mode or romaji input mode. An input method can also supply pre-edit information, partial feedback about characters in the process of being composed. The way an IM deals with status and pre-edit information is referred to as an IM style. This section describes styles and their naming.

Root Window

The Root Window style has a pre-edit area and a status area in a window owned by the IM as a descendant of the root.
The application does not manage the pre-edit data, the pre-edit area, the status data, or the status area. Everything is left to the input method to do in its own window, as illustrated in Figure 16-1.

Figure 16-1 : Root Window Input

Off-the-Spot

The Off-the-Spot style places a pre-edit area and a status area in the window being used, usually in reserved space away from the place where input appears. The application manages the pre-edit area and status area, but allows the IM to update the data there. (The application provides information regarding foreground and background colors, fonts, and so on.) A window using Off-the-Spot input style might look like that shown in Figure 16-2.

Figure 16-2 : Off-the-Spot Input

Over-the-Spot

The Over-the-Spot style involves the IM creating a small pre-edit window over the point of insertion. The window is owned and managed by the IM as a descendant of the root, but it gives the user the impression that input is being entered in the right place; in fact, the pre-edit window often has no borders and is invisible to the user, giving the appearance of On-the-Spot input. The application manages the status area as in Off-the-Spot, but specifies the location of the editing so that the IM can place pre-edit data over that spot.

On-the-Spot

On-the-Spot input is by far the most complex for the application developer. The IM delivers all pre-edit data via callbacks to the application, which must perform in-place editing, complete with insertion, deletion, and so on. This approach usually involves a great deal of string and text rendering support at the input generation level, above and beyond the effort required for completed input. Since this may mean a lot of updating of surrounding data or other display management, everything is left to the application. There is little chance an IM could ever know enough about the application to be able to help it provide user feedback.
The IM therefore provides status and edit information via callbacks. Done well, this style can be the most intuitive one for a user.

Setting IM Styles

A style describes how an IM presents its pre-edit and status information to the user. An IM supplies information detailing its presentation capabilities. The information comes in the form of flags combined with OR. The flags to use with each style are as follows:

    Root Window      XIMPreeditNothing | XIMStatusNothing
    Off-the-Spot     XIMPreeditArea | XIMStatusArea
    Over-the-Spot    XIMPreeditPosition | XIMStatusArea
    On-the-Spot      XIMPreeditCallbacks | XIMStatusCallbacks

For example, if you wanted a style variable to match an Over-the-Spot IM style, you could write:

    XIMStyle over = XIMPreeditPosition | XIMStatusArea;

If an IM returns XIMStatusNone (not to be confused with XIMStatusNothing), it means the IM will not supply status information.

Using Styles

An input method supports one or more styles. It's up to the application to find a style that is supported by both the IM and the application. If several exist, the application must choose. If none exist, the application is in trouble.

Input Contexts (ICs)

An input method may be serving multiple clients, or one client with multiple windows, or one client with multiple input styles on one window. The specification of style and client/IM communication is done via input contexts. An input context is simply a collection of parameters that together describe how to go about receiving and examining input under a given set of circumstances.

To set up and use an input context:

1. Decide what styles your application can support.
2. Query the IM to find out what styles it supports.
3. Find a match.
4. Determine information that the IC needs in order to work with your application.
5. Create the IC.
6. Employ the IC.

Find an IM Style

The IM may be able to support multiple styles, for example, both Off-the-Spot and Root Window.
The application may be able to do, in order of preference, Over-the-Spot, Off-the-Spot, and Root Window. The application should determine that the best match in this case is Off-the-Spot.

First, discover what the IM can do, then set up a variable describing what the application can do, as shown in Example 16-8.

Example 16-8 : Finding What a Client Can Do

    XIMStyles *IMcando;
    XIMStyle clientCanDo;           /* note type difference */
    XIMStyle styleWeWillUse = 0;    /* 0 means no match found yet */

    XGetIMValues(im, XNQueryInputStyle, &IMcando, NULL);
    clientCanDo =
        /*none*/ XIMPreeditNone     | XIMStatusNone    |
        /*over*/ XIMPreeditPosition | XIMStatusArea    |
        /*off*/  XIMPreeditArea     | XIMStatusArea    |
        /*root*/ XIMPreeditNothing  | XIMStatusNothing;

A client should always be able to handle the case of XIMPreeditNone | XIMStatusNone, which is likely in a Western locale. To the application, this is not very different from a Root Window style, but it comes with less overhead.

Once you know what the application can handle, look through the IM styles for a match, as shown in Example 16-9.

Example 16-9 : Setting the Desired IM Style

    int i;

    for (i = 0; i < IMcando->count_styles; i++) {
        XIMStyle tmpStyle = IMcando->supported_styles[i];

        /* accept a style only if every flag in it is one we can handle;
           the last acceptable style in the IM's list wins */
        if ((tmpStyle & clientCanDo) == tmpStyle)
            styleWeWillUse = tmpStyle;
    }
    if (styleWeWillUse == 0)
        exit_with_error();
    XFree(IMcando);
    /* styleWeWillUse is set, which is what we were after */

IC Values

There are several pieces of information an input method may require, depending on the input context and style chosen by the application. The input method can acquire any such information it needs from the input context, ignoring any information that does not affect the style or IM.

A full description of every item of information available to the IM is supplied in X Window System, Third Edition. The following is a brief list:

XNClientWindow
    Specifies to the IM which client window it can display data in or create child windows in. Set once and cannot be changed.
XNFilterEvents
    An additional event mask for event selection on the client window.
XNFocusWindow
    The window to receive processed (composed) Key events.
XNGeometryCallback
    A geometry handler that is called if the client allows an IM to change the geometry of the window.
XNInputStyle
    Specifies the style for this IC.
XNResourceClass, XNResourceName
    The resource class and name to use when the IM looks up resources that vary by IC.
XNStatusAttributes, XNPreeditAttributes
    The attributes to be used for any status and pre-edit areas (nested, variable-length lists).

Pre-Edit and Status Attributes

When an IM is going to provide state, it needs some simple X information with which to do its work. For example, if an IM is going to draw status information in a client window in an Off-the-Spot style, it needs to know where the area is, what color and font to render text in, and so on. The application gives this data to the IC for use by the IM.

As with the "IC Values" section, full details are available in X Window System, Third Edition.

XNArea
    A rectangle to be used as a status or pre-edit area.
XNAreaNeeded
    The rectangle desired by the attribute writer. Either the application or the IM may provide this information, depending on circumstances.
XNBackgroundPixmap
    A pixmap to be used for the background of windows the IM creates.
XNColormap
    The colormap to use.
XNCursor
    The cursor to use.
XNFontSet
    The fontset to use for rendering text.
XNForeground, XNBackground
    The colors to use for rendering.
XNLineSpacing
    The line spacing to be used in the pre-edit window if more than one line is used.
XNSpotLocation
    Specifies where the next insertion point is, for use by XIMPreeditPosition styles.
XNStdColormap
    Specifies that the IM should use XGetRGBColormaps() with the supplied property (passed as an Atom) in order to find out which colormap to use.
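The value supplied for XNInputStyle is the style chosen under "Find an IM Style". The loop in Example 16-9 keeps whichever acceptable style the IM happens to list last; an application that has an order of preference may want to walk its own list instead. The following is only a sketch of that idea, written without Xlib so that it stands alone: StyleMask stands in for XIMStyle, the PREEDIT_* and STATUS_* names are hypothetical placeholders for the XIMPreedit* and XIMStatus* flags, and choose_style() is an illustrative helper, not an Xlib routine.

```c
#include <stddef.h>

/* Stand-in for Xlib's XIMStyle bit mask (assumption for this sketch). */
typedef unsigned long StyleMask;

/* Placeholder flags for illustration only; real code should use the
 * XIMPreedit* and XIMStatus* constants from <X11/Xlib.h>. */
#define PREEDIT_AREA      0x0001UL
#define PREEDIT_POSITION  0x0004UL
#define PREEDIT_NOTHING   0x0008UL
#define PREEDIT_NONE      0x0010UL
#define STATUS_AREA       0x0100UL
#define STATUS_NOTHING    0x0400UL
#define STATUS_NONE       0x0800UL

/* Return the first entry of `preferred` (ordered best first) that also
 * appears in `supported`, or 0 if the two lists have no style in common. */
StyleMask choose_style(const StyleMask *preferred, size_t n_preferred,
                       const StyleMask *supported, size_t n_supported)
{
    for (size_t p = 0; p < n_preferred; p++) {
        for (size_t s = 0; s < n_supported; s++) {
            if (preferred[p] == supported[s])
                return preferred[p];
        }
    }
    return 0;
}
```

With a preference list of Over-the-Spot, Off-the-Spot, Root Window and an IM that reports Root Window and Off-the-Spot, this helper would return the Off-the-Spot combination, which is the outcome the text describes. In a real client the supported list would come from the XIMStyles structure returned for XNQueryInputStyle.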
Creating an Input Context

Creating an input context is a simple matter of calling XCreateIC() with a variable-length list of parameters specifying IC values. Example 16-10 shows a simple example that works for the Root Window style.

Example 16-10 : Creating an Input Context With XCreateIC()

    XVaNestedList arglist;
    XIC ic;

    arglist = XVaCreateNestedList(0,
                  XNFontSet, fontset,
                  XNForeground, WhitePixel(dpy, screen),
                  XNBackground, BlackPixel(dpy, screen),
                  NULL);
    ic = XCreateIC(im,
                  XNInputStyle, styleWeWillUse,
                  XNClientWindow, window,
                  XNFocusWindow, window,
                  XNStatusAttributes, arglist,
                  XNPreeditAttributes, arglist,
                  NULL);
    XFree(arglist);
    if (ic == NULL)
        exit_with_error();

Using the IC

A multi-window application may choose to use several input contexts. But for simplicity, assume that the application just wants to get to the internationalized input using one method in one window.

Using the IC is a matter of making sure you select the events the IC wants, and of setting IC focus. If you are setting up a window for the first time, you know the event mask you want, and you can use it directly. If you are attaching an IC to a previously configured window, you should query the window and add in the new event mask.

Example 16-11 : Using the IC

    unsigned long imEventMask;
    XWindowAttributes winAtts;

    XGetWindowAttributes(dpy, window, &winAtts);
    XGetICValues(ic, XNFilterEvents, &imEventMask, NULL);
    imEventMask |= winAtts.your_event_mask;
    XSelectInput(dpy, window, imEventMask);
    XSetICFocus(ic);

At this point, the window is ready to be used.

Events Under IM Control

Processing events under input method control is almost the same in X11R6 as it was under R4 and before. There are two essential differences: the XFilterEvent() and X*LookupString() routines.

Using XFilterEvent()

Every event received by your application should be fed to the IM via XFilterEvent(), which returns a value telling you whether or not to disregard the event.
An IM asks you to disregard the event if it has extracted the data and plans on giving it to you later, possibly in some other form. All events (not just KeyPress and KeyRelease events) go to XFilterEvent().

If you compacted the event processing into a single routine, a typical event loop would look something like the code in Example 16-12.

Example 16-12 : Event Loop

    XEvent event;

    for (;;) {
        XNextEvent(dpy, &event);
        if (XFilterEvent(&event, None))
            continue;    /* the IM has consumed this event */
        DealWithEvent(&event);
    }

Using XLookupString(), XwcLookupString(), and XmbLookupString()

When using an input method, you should replace calls to XLookupString() with calls to XwcLookupString() or XmbLookupString(). The MB and WC versions have very similar interfaces. The examples below arbitrarily use XmbLookupString(), but apply to both versions.

There are two new situations to deal with:

1. The string returned may be long.
2. There may be an interesting keysym returned, an interesting set of characters returned, both, or neither.

Dealing with the former is a matter of maintaining a buffer (an arena) that can grow, as in Example 16-13. To tell the application what to pay attention to for a given event, XmbLookupString() returns a status value in a passed parameter, equal to one of the following:

XLookupKeysym
    Indicates that the keysym should be checked.
XLookupChars
    Indicates that a string has been typed or composed.
XLookupBoth
    Means both of the above.
XLookupNone
    Means neither is ready for processing.
XBufferOverflow
    Means the supplied buffer is too small; call XmbLookupString() again with a bigger buffer.

XmbLookupString() also returns the length of the string in question. Note that XmbLookupString() returns the length of the string in bytes, while XwcLookupString() returns the length of the string in characters.

The example below should help show how these functions work. Most event processors perform a switch on the event type; assume you have done that and have received a KeyPress event.
Example 16-13 : KeyPress Event

    case KeyPress:
    {
        KeySym keysym;
        Status status;
        int buflength;
        static int bufsize = 16;
        static char *buf = NULL;

        if (buf == NULL) {
            buf = malloc(bufsize);
            if (buf == NULL)
                StopSequence();
        }
        buflength = XmbLookupString(ic, &event.xkey, buf, bufsize,
                                    &keysym, &status);
        /* first, check to see if that worked */
        if (status == XBufferOverflow) {
            /* buflength now holds the required size in bytes */
            char *bigger = realloc(buf, buflength);

            if (bigger == NULL)
                StopSequence();
            buf = bigger;
            bufsize = buflength;
            buflength = XmbLookupString(ic, &event.xkey, buf, bufsize,
                                        &keysym, &status);
        }
        /* We have a valid status. Check that */
        switch (status) {
        case XLookupKeysym:
            DealWithKeysym(keysym);
            break;
        case XLookupBoth:
            DealWithKeysym(keysym);
            /* **FALL INTO** character case */
        case XLookupChars:
            DealWithString(buf, buflength);
            break;
        case XLookupNone:
            break;
        } /* end switch(status) */
    } /* end case KeyPress segment */
    break; /* we are in a switch(event.type) statement */
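The retry-on-overflow step in Example 16-13 can be exercised without an X server. The sketch below is an illustration only: fake_lookup() is a hypothetical stand-in that imitates the documented XBufferOverflow behavior of XmbLookupString() (report overflow and return the required size in bytes), and LOOKUP_CHARS and LOOKUP_OVERFLOW are placeholder names, not the Xlib status constants. The point it tries to show is growing the buffer through a temporary pointer, so that the old buffer is not lost if realloc() fails.

```c
#include <stdlib.h>
#include <string.h>

/* Placeholder status codes; real code uses XLookupChars,
 * XBufferOverflow, and so on from <X11/Xlib.h>. */
enum lookup_status { LOOKUP_CHARS, LOOKUP_OVERFLOW };

/* Hypothetical stand-in for XmbLookupString(): copies `text` into `buf`
 * when it fits and returns its length in bytes; otherwise sets the
 * overflow status and returns the size that would be required. */
static int fake_lookup(const char *text, char *buf, int bufsize,
                       enum lookup_status *status)
{
    int needed = (int)strlen(text);

    if (needed > bufsize) {
        *status = LOOKUP_OVERFLOW;
        return needed;
    }
    if (needed > 0)
        memcpy(buf, text, (size_t)needed);
    *status = LOOKUP_CHARS;
    return needed;
}

/* Look up `text`, growing *buf once if the first attempt overflows.
 * Returns the byte count, or -1 if memory could not be obtained, in
 * which case *buf and *bufsize are left unchanged. */
int lookup_with_growth(const char *text, char **buf, int *bufsize)
{
    enum lookup_status status;
    int len = fake_lookup(text, *buf, *bufsize, &status);

    if (status == LOOKUP_OVERFLOW) {
        char *bigger = realloc(*buf, (size_t)len);

        if (bigger == NULL)
            return -1;              /* the old buffer is still valid */
        *buf = bigger;
        *bufsize = len;
        len = fake_lookup(text, *buf, *bufsize, &status);
    }
    return len;
}
```

A caller would keep buf and bufsize between events, as Example 16-13 does with its static variables, so the buffer only grows when a longer composed string arrives. Note that the returned bytes are not null-terminated, which matches the way Example 16-13 passes an explicit length to DealWithString().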