Generate Chinese PDFs on Unix without Chinese support?

Sounds impossible?

People know how to generate PDFs containing CJK (Chinese, Japanese, and Korean) characters under Windows and Linux, and the generated PDFs support searching, copying, and pasting of CJK text. But what about a Unix machine without any CJK support? How can you do it there?

Emacs is what you need to edit Chinese (and Japanese and Korean) text files. Emacs (21.2, or a version with leim installed) has built-in support for encoding and decoding Chinese text, as well as text in many other scripts, and it runs on almost all platforms. You do need proper fonts for the text to display correctly, though. Here is a simple step-by-step tutorial on editing a Chinese text file without modifying your .emacs file.
  1. Run Emacs.
  2. Type ``Ctrl-x RET c''.
  3. Enter ``chinese-iso-8bit'' (the Tab and Space keys will complete the name for you). ``cn-gb'' is an alias of ``chinese-iso-8bit'', so you can use that instead.
  4. Open your file with ``Ctrl-x Ctrl-f'' (it will be created if it doesn't exist).
  5. Type ``Ctrl-x RET Ctrl-\'' to select an input method.
  6. Enter ``chinese-py-punct''.
  7. You can now type Chinese; use `v' as a prefix to enter punctuation marks.
  8. ``Ctrl-\'' switches between English and Chinese input modes.
  9. When you are finished, save with ``Ctrl-x Ctrl-s'' as usual. Nothing special.


To use CJK with LaTeX we need the latex-cjk package. dvipdfmx can embed TrueType fonts into the PDF files it generates, which is what lets the resulting PDFs support searching and copying of Chinese text.
I assume that you already have teTeX (2.0beta or above) installed, along with the header and library files of kpathsea (normally they are under the teTeX directory). Other TeX/LaTeX distributions might work in a similar way, but I don't have them, so I cannot guarantee it.
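
Before starting, it may help to see which of the commands used below are already on your PATH. The following is only a minimal sketch: ``have_tool'' and ``check_tools'' are hypothetical helper names made up for this example, and the script only checks that a command exists, not its version or whether the kpathsea headers are installed.

```shell
#!/bin/sh
# Report which of the commands used in this walkthrough are on PATH.
# have_tool and check_tools are hypothetical helper names, not part of teTeX.

have_tool() {
    # Succeeds if the shell can find a command with this name.
    command -v "$1" >/dev/null 2>&1
}

check_tools() {
    # Prints one line per tool; the return status is the number missing.
    missing=0
    for tool in "$@"; do
        if have_tool "$tool"; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# dvipdfmx is expected to be missing until it has been built in step 6 below.
check_tools latex texhash kpsewhich dvipdfmx || true
```

A ``missing'' line for latex, texhash, or kpsewhich suggests that teTeX is not installed or not on your PATH.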

  1. Prepare a directory under your $HOME for a local TeX/LaTeX installation. For example, you can create a directory ``texmf''. From now on we will refer to this directory as your $TEXMFLOCAL. You need to set the environment variable $TEXMFLOCAL to the directory you prepared before proceeding to the next step.
  2. Download the latex-cjk package from http://cjk.ffii.org/ and install it. This is easy: unpack the .tar.gz file, copy the directory ``texinput'' into ``$TEXMFLOCAL/tex/latex'', and rename it to ``CJK''. (You don't actually need all of its subdirectories, but since they don't take up much space, you can just leave them there.) You can install the package's documentation too if you like. Then run ``texhash''. You should see texhash updating your $TEXMFLOCAL/ls-R file; if not, something went wrong and you need to check that. You can now try to compile a mini test file like
    \documentclass{article}
    \usepackage{CJK}
    \begin{document}
    \begin{CJK*}{GBK}{song}
    这是中文宋体字。
    A wonderful world!
    \end{CJK*}
    \end{document}
    
  3. Run ``latex test''. Compile error, huh? LaTeX complains ``Metric (TFM) file not found.'' To go on, we need to add at least the TFM files for the Chinese fonts.
    1. Find ``gbkfonts'' and use it to produce all the needed files.
    2. Or download them from here. I only include TFM files because they are enough for generating PDFs with dvipdfmx. If you want to generate .ps files with Chinese characters, you need either ``gbkfonts'' or Type 1 font files from other sources.
    3. Uncompress the above .tar.gz file and you will get a ``chinese'' directory. Move it under ``$TEXMFLOCAL/fonts/tfm''.
    4. Run ``texhash'' again.
    5. Run ``latex test'' again.
  4. Still get an error? What's wrong? It's the ``c19song.fd'' that comes with latex-cjk: it needs to be patched! Open ``$TEXMFLOCAL/tex/latex/CJK/GB/c19song.fd'' in your favorite editor and make changes like those in this c19song.fd. You also need font definition files for the other fonts: replace every occurrence of ``song'' with ``hei'' in ``c19song.fd'', save the result as ``c19hei.fd'', and you have the .fd file for hei. Repeat this for fs, kai, li, and you.
  5. Now try ``latex test'' again. You should get a .dvi file without any errors.
  6. Download dvipdfmx from The Dvipdfmx Project page and install it.
    1. Unpack the source and specify a prefix for the ./configure script, for example ``./configure --prefix=/usr/myotherlocal''. The default prefix is ``/usr/local''. You need libkpathsea to compile dvipdfmx; it comes with teTeX 2.0 and above.
    2. Run ``gmake install''.
    3. Run ``cp -R /usr/local/share/texmf/dvipdfm $TEXMFLOCAL''. Use your own prefix directory if it differs from ``/usr/local''.
    4. Append the following lines to the end of ``$TEXMFLOCAL/dvipdfm/config/cid-x.map''.
      gbksong@UGBK@     UniGB-UCS2-H    :0:simsun.ttf
      gbksongsl@UGBK@   UniGB-UCS2-H    :0:simsun.ttf, Italic
      gbkhei@UGBK@      UniGB-UCS2-H    :0:simhei.ttf
      gbkheisl@UGBK@    UniGB-UCS2-H    :0:simhei.ttf, Italic
      gbkkai@UGBK@      UniGB-UCS2-H    :0:simkai.ttf
      gbkkaisl@UGBK@    UniGB-UCS2-H    :0:simkai.ttf, Italic
      gbkfs@UGBK@       UniGB-UCS2-H    :0:simfang.ttf
      gbkfssl@UGBK@     UniGB-UCS2-H    :0:simfang.ttf, Italic
      gbkli@UGBK@       UniGB-UCS2-H    :0:simli.ttf
      gbklisl@UGBK@     UniGB-UCS2-H    :0:simli.ttf, Italic
      gbkyou@UGBK@      UniGB-UCS2-H    :0:simyou.ttf
      gbkyousl@UGBK@    UniGB-UCS2-H    :0:simyou.ttf, Italic
      	
    5. Run ``texhash'' again. Now you can run ``dvipdfmx test'' on the test.dvi we generated before. Not surprisingly, you will get an error: (Could not open the SubFont Definition file 'UGBK.sfd'). We referred to ``UGBK'' in cid-x.map but we don't have it yet! You can get it from the ttf2pk/data directory of the freetype1-contrib package (an example links here). Put it under ``$TEXMFLOCAL/ttf2pk/base''. Besides this file, you also need ``Adobe-GB1-UCS2'', ``UniGB-UCS2-H'', and ``UniGB-UCS2-V''. You should be able to find them under the Acrobat Reader installation directory (I know the Windows version has them and I assume the Linux version does too). Or here. Put these three files into ``$TEXMFLOCAL/dvipdfm/CMap''. Now run ``texhash''.
  7. The last thing we need is the TrueType fonts!
    1. Get those .ttf files and put them under ``$TEXMFLOCAL/fonts/truetype''. Make sure the file names (all lower case) are the same as those specified in cid-x.map.
    2. Copy the file ``$TEXMF/web2c/texmf.cnf'' to ``$TEXMFLOCAL/web2c''. Find the line that sets the TrueType font path and add your own path, like
      TTFONTS = .;$TEXMF/fonts/truetype//;/home/user/texmf/fonts/truetype//	
      	
    3. Run a final ``texhash'' and we are done! Run ``dvipdfmx -vv test'' and you should get a beautiful PDF file. (The -vv option turns on verbose mode so that you can see much more information. You can omit it if you don't want that.)
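
If dvipdfmx still cannot find a font, one thing worth checking is the file-name match described in step 7.1. The sketch below lists every .ttf name mentioned in a cid-x.map-style file that is not present in a font directory. ``missing_fonts'' is a hypothetical helper name made up for this example, and the script assumes map lines shaped like the ones shown in step 6.4 (a ``:0:name.ttf'' field, with lines starting with % ignored).

```shell
#!/bin/sh
# List the TrueType files named in a cid-x.map-style file that are absent
# from a font directory. missing_fonts is a hypothetical helper name.

missing_fonts() {
    map_file=$1
    font_dir=$2
    # Extract the file name from each ":<index>:name.ttf" field, skipping
    # lines that start with a % comment, then test each distinct name once.
    sed -n 's/^[^%]*:[0-9][0-9]*:\([^ ,]*\.ttf\).*/\1/p' "$map_file" |
    sort -u |
    while read -r font; do
        [ -f "$font_dir/$font" ] || echo "$font"
    done
}

# Runs only when the layout from the steps above is already in place.
if [ -n "${TEXMFLOCAL:-}" ] && [ -f "$TEXMFLOCAL/dvipdfm/config/cid-x.map" ]; then
    missing_fonts "$TEXMFLOCAL/dvipdfm/config/cid-x.map" \
                  "$TEXMFLOCAL/fonts/truetype"
fi
```

No output means every font named in the map was found. On a case-sensitive file system, a file stored as ``SIMSUN.TTF'' would be reported as missing for a map entry ``simsun.ttf''.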
That's it! Enjoy your success! I have a sample file test.pdf and its source file test.tex here.
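
The search-and-replace from step 4 can also be scripted instead of done by hand in an editor. This is just a sketch: ``make_fd_files'' is a hypothetical helper name made up for this example, and it assumes that c19song.fd has already been patched and that a plain replacement of ``song'' is all the other families need, as step 4 describes.

```shell
#!/bin/sh
# Derive .fd files for other font families from an already patched
# c19song.fd by replacing every "song" with the family name (step 4).
# make_fd_files is a hypothetical helper name.

make_fd_files() {
    fd_dir=$1
    shift
    for family in "$@"; do
        # c19song.fd itself is left untouched; each copy gets its own name.
        sed "s/song/$family/g" "$fd_dir/c19song.fd" > "$fd_dir/c19$family.fd"
    done
}

# Runs only when the patched file from step 4 is already in place.
if [ -n "${TEXMFLOCAL:-}" ] && [ -f "$TEXMFLOCAL/tex/latex/CJK/GB/c19song.fd" ]; then
    make_fd_files "$TEXMFLOCAL/tex/latex/CJK/GB" hei fs kai li you
fi
```

The family names passed on the last line are the ones listed in step 4; they match the gbkhei, gbkfs, gbkkai, gbkli, and gbkyou entries added to cid-x.map.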

Please send questions, comments, and suggestions to ymeng_9 at etang dot com