[OS X] accents, etc., in EndNote exported to Bibtex
Andy Jacobson
andyj at splash.princeton.edu
Thu Oct 13 19:38:55 EDT 2005
Howdy,
I use EndNote to import and store my references. Occasionally I
use the apple-K feature to copy a formatted reference to put into an
email or something. But principally, I write in LaTeX, so I
export the references to BibTeX format.
With EN7, I used a Perl script to find accented characters and
replace them with TeX equivalents. This involved opening the
exported data file in Emacs, looking for an accented character, and
using C-x = to find the hex code of the special character. Then I'd
put a line into my Perl filter that would replace instances of that
hex code with the desired TeX code (e.g. s/\x88/{\\\'a}/g; would
replace hex character 0x88 with {\'a}, the TeX code for a with an
acute accent).
I also used this Perl script to replace instances of "co2" with
{CO}$_2$, etc.
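The filter approach described above can be sketched roughly as follows. This is only an illustration, not the original EN7 filter: the names %tex_for_byte, bytes_to_tex and fix_co2 are hypothetical, and the table holds just the one byte mentioned above (0x88); any other bytes would have to be looked up and added the same way.

```perl
use strict;
use warnings;

# Hypothetical lookup table: one entry per special byte found with C-x =.
# Only the 0x88 entry comes from the text; the rest would be added by hand.
my %tex_for_byte = (
    "\x88" => q({\'a}),    # byte 0x88 -> a with an acute accent
);

# Replace every byte listed in the table with its TeX code.
sub bytes_to_tex {
    my ($line) = @_;
    for my $byte (keys %tex_for_byte) {
        $line =~ s/\Q$byte\E/$tex_for_byte{$byte}/g;
    }
    return $line;
}

# Rewrite "co2" or "CO2" so BibTeX keeps the capitals and TeX sets the subscript.
sub fix_co2 {
    my ($line) = @_;
    $line =~ s/[cC][oO]2/{CO}\$_2\$/g;
    return $line;
}

# Made-up sample line, not taken from a real EndNote export.
my $sample = "Oceanic co2 uptake near Bogot\x88";
print fix_co2(bytes_to_tex($sample)), "\n";
# prints: Oceanic {CO}$_2$ uptake near Bogot{\'a}
```

Keeping the byte codes in one table means a newly discovered character needs only one new entry rather than another substitution line.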
EN8 now outputs Unicode (UTF-8) for most special characters. My
understanding is that these are multi-byte codes. I started to
replace all my filter lines with new multi-byte codes, but rapidly
tired of it. I discovered, however, that TeX can handle UTF-8
input. If you put the right magic in the TeX preamble, you can use
the output of the EndNote export directly! Without further ado, the
magic is:
\usepackage[utf8]{inputenc}
While this deals with just about everything, I noticed that the
degree symbol as exported by EN8 isn't recognized even with this
setting, and causes a TeX error. For this reason, and because I still
need to format things like {CO}$_2$, I still use the Perl filter. I attach
it to this email. Usage instructions are on the first (commented)
line. It is called "en2bib.pl".
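As a quick illustration of the degree-symbol workaround, here is a minimal sketch that applies the same substitution as the filter to a made-up sample string. The sub name fix_degrees and the sample text are hypothetical; the byte sequence 0xC2 0xB0 is the UTF-8 encoding of the degree sign.

```perl
use strict;
use warnings;

# Rewrite the UTF-8 degree sign (bytes 0xC2 0xB0) as TeX math, and brace the
# character that follows so BibTeX keeps the case of N, E, W, S.
sub fix_degrees {
    my ($line) = @_;
    $line =~ s/\xc2\xb0(.)/\$^\\circ\${$1}/g;
    return $line;
}

# Made-up sample title, not taken from a real EndNote export.
print fix_degrees("Fluxes north of 30\xc2\xb0N"), "\n";
# prints: Fluxes north of 30$^\circ${N}
```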
Best,
Andy
---cut here-----------------------------------------------------
# perl en2bib.pl < en.bib > arj.bib
while(<>) {
s/\xef\xbb\xbf//g; # strip the UTF-8 byte-order mark at the start of the file
s/\r/\n/g; # convert carriage returns to newlines
s/DOI/Doi/g; # use this if the bst supports DOI
# s/DOI/Note/g; # use this instead if the bst does not support DOI
s/\{Manuscript submitted to (.*)\}/{{S}ubmitted to {\\it $1}}/g;
s/DC\*/\${\\Delta}\${C}*/ig;
s/p[cC][oO]2/\$p\$\{CO\}\$_2\$/g;
s/[cC][oO]2/\{CO\}\$_2\$/g;
s/[oO]2/\{O\}\$_2\$/g;
s/[nN]2/\{N\}\$_2\$/g;
s/d13C/\$\\delta\^\{13\}\\text\{C\}\$/g;
s/13C/\$^\{13\}\\text\{C\}\$/g; # braces so both digits are superscripted
s/12C/\$^\{12\}\\text\{C\}\$/g;
s/14C/\$^\{14\}\\text\{C\}\$/g;
# degrees; ensure uppercase of N,E,W,S with {}
s/\xc2\xb0(.)/\$^\\circ\${$1}/g;
s/[eE]l [nN]i(?:\xc3\xb1|.)o/{El}~{Ni{\\~n}o}/g; # n-tilde may be two UTF-8 bytes or a single byte
s/Transcom/{TransCom}/ig;
s/Pacific/{Pacific}/g;
s/Atlantic/{Atlantic}/g;
s/Indian/{Indian}/g;
s/Antarctic/{Antarctic}/g;
s/Arctic/{Arctic}/g;
s/Europ/{Europ}/g;
s/Amazon/{Amazon}/g;
s/Asia/{Asia}/g;
s/Americ/{Americ}/g;
# uppercase 1st letter of sentence after : . or ?
s/([\:\?\.]) *([A-Z])/$1 \{\U$2\E}/g;
# upper-case acronyms (two or more letters long)
s/([A-Z]{2,})/{$1}/g;
# remove {} from letters in DOI field
s/(Doi = \{[0-9]{2}\.[0-9]{4}\/[0-9]{4})\{([A-Z]{2,})\}([0-9]*\},)/$1$2$3/g;
print $_;
}
---cut here-----------------------------------------------------
--
Andy Jacobson
andy.jacobson at noaa.gov
Program in Atmospheric and Oceanic Sciences
Sayre Hall, Forrestal Campus
Princeton University
PO Box CN710 Princeton, NJ 08544-0710 USA
Tel: 609/258-5260 Fax: 609/258-2850