wok diff tramys/stuff/README @ rev 25062
Up foomatic-db-nonfree (20210824)
author | Pascal Bellard <pascal.bellard@slitaz.org> |
---|---|
date | Tue Jun 07 10:29:31 2022 +0000 (2022-06-07) |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tramys/stuff/README Tue Jun 07 10:29:31 2022 +0000
@@ -0,0 +1,306 @@
tramys - TRAnslate MY Slitaz
Tool for managing translation files for SliTaz GNU/Linux

Aleksei Bobylev <al.bobylev@gmail.com>, 2014


Some random notes about tramys development.


The idea
========

I like to use applications translated into my language. On the other hand, I
like the fact that SliTaz is not overloaded with unnecessary files.

Some packages have “twins” that contain only translations. But note that all
the translations for GIMP take about 30 MB, and those for Wesnoth about 90 MB!
I don't need ALL those translations. Really. I need only one.

We now have some language packs in the SliTaz repository. These packs contain
translations of several packages for several chosen locales. Not bad, but what
should I do if I need translations for other packages not listed there?

We set up ftp.slitaz.org. Then I thought it would be a good idea to look there
for the translation files you want and fetch them. And what about automation?

The translations can be found in the “install” sub-folders. For example:
ftp://cook.slitaz.org/nano/install/usr/share/locale/ru/LC_MESSAGES/nano.mo
for the “nano” package with the “ru” locale.

We can also download the same file through the cooker:
http://cook.slitaz.org/cooker.cgi?download=../wok/nano/install/usr/share/locale/ru/LC_MESSAGES/nano.mo


About gettext's locale search order
===================================

At first glance it looks easy. My locale is “ru_UA” (I speak Russian and live
in Ukraine). Gettext searches for “ru_UA” translations first. When it does not
find them, it drops the country from the locale and looks for “ru”
translations.
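
The two-step fallback just described can be sketched in shell. This is only an
illustration, not tramys code: the function name locale_candidates is made up
here, and it covers just the simple case of a “ll_CC” locale falling back to
“ll” (no encoding, no variant).

```shell
#!/bin/sh
# Hypothetical helper (not part of tramys): print the locale names to try,
# most specific first, for a simple "ll_CC" locale such as "ru_UA".
locale_candidates() {
	loc="$1"
	echo "$loc"
	case "$loc" in
		*_*) echo "${loc%%_*}" ;;   # drop the country part: ru_UA -> ru
	esac
}

locale_candidates ru_UA
```

For “ru_UA” this prints “ru_UA” and then “ru”; for a bare language such as
“fr” it prints only “fr”.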

I also know that a full locale definition can additionally contain an encoding
and a variant:
  language_COUNTRY.ENCODING@variant
All parts except the language are optional. I couldn't figure out the search
order from the official docs
[[ https://www.gnu.org/software/gettext/manual/gettext.html ]],
so I experimented with the following piece of code:

# List the locale names that are tried, in order (taken from the strace log).
VAR0=$(LC_ALL="LL_CC.EE@VV" strace -o /tmp/s -e trace=file gettext test test; \
grep -F '/usr/lib/locale/' /tmp/s | \
grep -v locale-archive | \
grep -F '/LC_IDENTIFICATION' | \
sed 's|.*/usr/lib/locale/\([^/]*\).*|\1|g' | \
tr '\n' ' ')
echo "${VAR0#test}"

Here a special non-existent value of the LC_ALL variable is used in order to
see all the search variants (gettext stops searching as soon as it finds one of
them). Try it with different LC_ALL values, such as "LL@VV" or "LL.EE".


About preferred languages
=========================

This is a nice feature I found in the Gettext docs, but have not applied yet.
When Gettext finds no matching translation, it shows the untranslated
(= English) messages. Some of us did not learn English at school; it could have
been German or French. So I want to see translations into my native language
or, if they do not exist, let's say, French translations. All of this is done
by setting the LANGUAGE environment variable.


About LC_ALL
============

In some cases it is not correct to set this variable. And if we want to mimic
Gettext's behavior more closely, we need to check several locale environment
variables in a certain order (for messages: LANGUAGE, then LC_ALL, LC_MESSAGES
and LANG).


About traffic saving
====================

Currently tramys downloads all the localization files again and again. It does
not know whether your existing localization file is up to date or outdated; it
just re-downloads it.

At the moment we actually have no simple solution. Every solution touches both
the client and the server side.

1. Using GNU wget
-----------------

It has the '-N' option:

  -N,  --timestamping            don't re-retrieve files unless newer than
                                 local.

But: a) a default SliTaz uses BusyBox's wget, which has no '-N' option, and
b) the SliTaz HTTP server does not return the file's timestamp:

$ curl -I 'http://cook.slitaz.org/cooker.cgi?download=../wok/nano/install/usr/share/locale/ru/LC_MESSAGES/nano.mo'
HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 55436
Content-Disposition: attachment; filename=nano.mo
Date: Wed, 06 Aug 2014 20:53:37 GMT
Server: lighttpd (SliTaz GNU/Linux)

Our FTP server does return "Last-Modified", but neither wget makes use of it :(

$ curl -I 'ftp://cook.slitaz.org/nano/install/usr/share/locale/ru/LC_MESSAGES/nano.mo'
Last-Modified: Thu, 10 Apr 2014 20:34:37 GMT
Content-Length: 55436
Accept-ranges: bytes

$ busybox wget -O /tmp/nano1.mo 'ftp://cook.slitaz.org/nano/install/usr/share/locale/ru/LC_MESSAGES/nano.mo'
Connecting to cook.slitaz.org (37.187.4.13:21)
nano1.mo             100% |*******************************| 55436   0:00:00 ETA

$ ls -l /tmp/nano1.mo
-rw-r--r--    1 tux      users        55436 Aug  6 22:01 /tmp/nano1.mo

$ wget -O /tmp/nano2.mo 'ftp://cook.slitaz.org/nano/install/usr/share/locale/ru/LC_MESSAGES/nano.mo'

$ ls -l /tmp/nano2.mo
-rw-r--r--    1 tux      users        55436 Aug  6 22:03 /tmp/nano2.mo


2. Write new client-server infrastructure
-----------------------------------------

We can write a script instead of using the two-byte solution (-N) :D
The client uses BusyBox's wget.
The script logic is as follows. The client sends a request to the server: the
file name and its date. The server returns a newer file or returns nothing...
Only one connection per file needs to be established.


3. Using curl
-------------

What about curl? Yes, it works:

$ curl -R -o /tmp/nano.mo 'ftp://cook.slitaz.org/nano/install/usr/share/locale/ru/LC_MESSAGES/nano.mo'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 55436  100 55436    0     0  30898      0  0:00:01  0:00:01 --:--:-- 32361

$ ls -l /tmp/nano.mo
-rw-r--r--    1 tux      users        55436 Apr 10 20:34 /tmp/nano.mo

curl can also ask the server for gzipped content to save traffic, and
transparently decompress it for you:

$ curl -h
     --compressed    Request compressed response (using deflate or gzip)

And wget can send any specified header to the server:

$ wget -h
       --header=STRING         insert STRING among the headers.


Small note about date
=====================

Do you remember the server's answer?

Last-Modified: Thu, 10 Apr 2014 20:34:37 GMT

We can get the date of a local file in this format with the following code:

$ LC_ALL=C date -Rur ./nano.mo
Thu, 10 Apr 2014 20:34:37 UTC

We only need to strip “GMT” and “UTC” respectively, and then we can compare
the two date strings:
if [ "${SERVER_DATE% GMT}" != "${LOCAL_DATE% UTC}" ]; then ...


About lists format
==================

There are three kinds of localization: GNU gettext's mo-files, Qt's qm-files,
and other techniques (not supported yet).

Gettext is more standardized: most often the translation file is named after
the package and sits in the hierarchical tree
/usr/share/locale/<locale name>/LC_MESSAGES
Most often, but not always.

Qt, on the other hand, frequently uses a single directory for all of a
package's translations. Something like
/usr/share/<package>/translations/<package>_<locale>.qm
Again, not always.

We could save all file names with full paths in one archive, as is done in the
tazpkg file /var/lib/tazpkg/files.list.lzma, and we would get a 1.4 MiB file
(48 KiB in LZMA). But I prefer lists in a special format: I think a plain list
contains too much redundant information, and in some cases it is too hard to
determine which part of the file name is the locale.

So, let me describe the lists format.
There are one or more lines for each package that supports localization. Each
line has four tab-delimited fields. The first two are mandatory; the other two
are optional.

1: package name
2: locale name
3: name of the file that contains translations
4: full path to the directory that contains that file

For the “nano” package (30 lines):
nano	bg	nano.mo	/usr/share/locale/bg/LC_MESSAGES
...
nano	zh_TW	nano.mo	/usr/share/locale/zh_TW/LC_MESSAGES

Now let's apply some rules to make the list smaller.

RULE. Use “%” as a placeholder for the locale name in the path:
      /usr/share/locale/%/LC_MESSAGES
RULE. Combine several locales into one space-separated list:
      bg ca cs da de es eu fi fr ga gl hu id it ms nb nl nn pl pt_BR ro ru rw sr
      sv tr uk vi zh_CN zh_TW
RULE. Remove “.mo” from the end of file names:
      nano
RULE. Omit the file name completely if it equals the package name.
RULE. Omit the default path “/usr/share/locale/%/LC_MESSAGES” completely.
RULE. Empty 3rd and/or 4th fields can be left out:
      empty 3:   field1 tab field2 tab tab field4
      empty 4:   field1 tab field2 tab field3
      empty 3&4: field1 tab field2

So now the rule for the “nano” package is very simple (one line):
nano	bg ca cs da de es eu fi fr ga gl hu id it ms nb nl nn pl pt_BR ro ru rw sr sv tr uk vi zh_CN zh_TW

And a few more rules to compress the list further.
RULE. Combine several mo-files into one space-separated field if they have
      identical lists of locales.
      The “gtk+” package has “gtk20 gtk20-properties” in the third field.
      Several paths can likewise be combined into one space-separated list.
RULE. Use shell-syntax constants to save a few more bytes:
      US="/usr/share"
      LC="LC_MESSAGES"
      PY="/usr/lib/python2.7/site-packages"
      R="/usr/lib/R/library"
      RT="$R/translations/%/$LC"

In some situations we have a choice:

lcdnurse	es fr he nl pt_BR ru th tr zh_CN		$US/$P/locale/%/$LC
lcdnurse	es fr nl pt_BR ru tr zh_CN	wxstd	$US/$P/locale/%/$LC

or:

lcdnurse	he th		$US/$P/locale/%/$LC
lcdnurse	es fr nl pt_BR ru tr zh_CN	$P wxstd	$US/$P/locale/%/$LC

Both variants work, and neither is wrong. Also, the second variant is shorter
by 24 bytes :)


Lists: to be or not...
======================

While I was developing tramys, my lists slowly became more and more outdated.
Many new packages in the wok, many upgrades... It seems that the tramys lists
would need to be released very frequently. And I can't write all the
sophisticated rules needed to automate the process.

It would not be bad to attach the localization info to the package itself! As
a separate file, like the description file? No. The more files our filesystem
contains, the slower it is. Better to attach it to the package receipt. In this
case we do not need the first field. Something like:

L10N="he th		$US/$P/locale/%/$LC
es fr nl pt_BR ru tr zh_CN	$P wxstd	$US/$P/locale/%/$LC"

Off topic. I think it's better to put the description into the receipt too:
description()
{
	cat << EOT
Description here
EOT
}

And for compatibility: read the info both from the receipt (if any) and from
the lists.


TODO
====

- Remove all translation files from all existing packages.
- Migrate lists to receipts.
- Support preferred languages via the LANGUAGE variable.
- Write a server-side script to get only changed/newer translation files.
- Add a tazpkg hook to get translations after package install (if the user
  wants).
- ...
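

As a closing illustration of the compact lists format described earlier, here
is a minimal sketch of how one list line could be expanded back into full file
names for a chosen locale. It is not the actual tramys code. The function name
expand_line is hypothetical; only the “%” placeholder, the “.mo” suffix and
the two defaults (file name equals package name, default path) are handled; a
single path per line is assumed; and shell constants such as $US are assumed
to be expanded already.

```shell
#!/bin/sh
# Hypothetical sketch (not tramys itself): expand one compact list line for
# one locale. Fields are tab-separated:
#   package <tab> locales <tab> [files] <tab> [path]
expand_line() {
	want="$1"; line="$2"

	# cut splits on tabs by default and keeps empty fields.
	pkg=$(printf '%s\n' "$line" | cut -f1)
	locales=$(printf '%s\n' "$line" | cut -f2)
	files=$(printf '%s\n' "$line" | cut -f3)
	path=$(printf '%s\n' "$line" | cut -f4)

	# Apply the defaults for omitted fields.
	[ -n "$files" ] || files="$pkg"
	[ -n "$path" ] || path="/usr/share/locale/%/LC_MESSAGES"

	# Print nothing if the wanted locale is not listed for this package.
	case " $locales " in
		*" $want "*) ;;
		*) return 0 ;;
	esac

	# Substitute the locale for "%" (assumes the locale name contains
	# no characters that are special to sed, which holds for names
	# like ru or pt_BR).
	dir=$(printf '%s\n' "$path" | sed "s|%|$want|g")
	for f in $files; do
		echo "$dir/$f.mo"
	done
}

expand_line ru "$(printf 'nano\tbg ca ru uk')"
expand_line ru "$(printf 'gtk+\tde ru\tgtk20 gtk20-properties')"
```

The first call prints /usr/share/locale/ru/LC_MESSAGES/nano.mo; the second
prints the same directory with gtk20.mo and gtk20-properties.mo.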