Thursday, February 15, 2007

Let file(1) recognize a chemical MIME type - the next level

As written earlier, the file command can detect chemical MIME types if you feed its database with the necessary definition rules. Now check out the cmd.magic.mime file and run the file command with

$ file -m cmd.magic.mime -i your_test_file.ext

for one or more chemical files. Detection is limited to the chemical MIME types that have a magic pattern in the chemical-mime-data database.

The file cmd.magic.mime was created via XSLT conversion of the original chemical-mime-data database and can be used for KDE and file(1). One of the upcoming package releases (probably 0.1.95) will provide it.

Update

Unfortunately, this magic.mime file cannot be used for KDE. As David Faure pointed out, KDE uses an older syntax that doesn't know, for example, the search type. So I have to find another solution for KDE or wait for better days.

Tuesday, February 13, 2007

Let file(1) recognize a chemical MIME type

OK, it's 8 o'clock in the morning and I need some sleep. But first let me show you a small example of how to detect chemical MIME types with the file(1) command. Put the following into your /etc/magic (the local magic data configuration file for the file(1) command):

0               string          VjCD0100                CDX binary file
>8              belong          0x04030201
>>12            bequad          0x0000000000000000
>>>20           beshort         0x0000
>>>>34          string          ChemDraw                written with ChemDraw
>>>>>42         string          x                       \b%.4s
>>>>22          beshort         0x0080                  (new format)
>>>>22          beshort         0x0000                  (old format)

Now search for a CDX (ChemDraw binary) file and run the file command:

$ find . -name "*.cdx" -exec file "{}" ";"
./example.cdx: data (that's really broken)
./dummy.cdx: lif file (that's a CACTVS file in reality)
./x-chemdraw/structures25-27.cdx: CDX binary file written with ChemDraw 4.5 (old format)
./x-chemdraw/structures96-101.cdx: CDX binary file written with ChemDraw 4.5 (old format)
./x-chemdraw/structures40-48.cdx: CDX binary file written with ChemDraw 4.5 (old format)
./x-chemdraw/dimethylamine.cdx: CDX binary file written with ChemDraw 7.0 (old format)
./x-chemdraw/dimethylaminesimple.cdx: CDX binary file (old format)
./x-chemdraw/untitled.cdx: CDX binary file written with ChemDraw 8.0 (new format)
./x-chemdraw/structures1-12.cdx: CDX binary file written with ChemDraw 4.5 (old format)
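For anyone who wants to see what the rule chain above actually tests, here is a rough Python sketch of the same checks. It is only an illustration and not what file(1) does internally. The function name describe_cdx is made up for this example, and I assume the version field at offset 42 starts with a space (e.g. " 4.5"), which is what the output above suggests.

```python
import struct
from typing import Optional

MAGIC = b"VjCD0100"  # level 0: string at offset 0


def describe_cdx(data: bytes) -> Optional[str]:
    """Return a file(1)-like description for a CDX header, or None."""
    if not data.startswith(MAGIC):
        return None
    parts = ["CDX binary file"]
    if len(data) < 22:
        return parts[0]

    # Levels 1-3: belong at 8, bequad at 12, beshort at 20 (all big-endian).
    marker, reserved, zero = struct.unpack_from(">IQH", data, 8)
    if marker != 0x04030201 or reserved != 0 or zero != 0:
        return parts[0]

    # Level 4/5: optional "ChemDraw" at 34, then up to 4 bytes of version at 42.
    if data[34:42] == b"ChemDraw":
        version = data[42:46].split(b"\0", 1)[0].decode("ascii", "replace")
        # Assumption: the version bytes carry their own leading space.
        parts.append("written with ChemDraw" + version)

    # Level 4: beshort at 22 distinguishes the two format variants.
    if len(data) >= 24:
        (fmt,) = struct.unpack_from(">H", data, 22)
        if fmt == 0x0080:
            parts.append("(new format)")
        elif fmt == 0x0000:
            parts.append("(old format)")
    return " ".join(parts)
```

Feeding it the first 46 bytes of a file (for example open(path, "rb").read(46)) is enough, because no rule looks beyond offset 45.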

Why am I doing this? The chemical-mime-data package contains magic patterns that can be used to automatically create the rules for the file command, so file(1) can determine the chemical MIME type too. Expect a stylesheet to extract this information from the database soon.

And now I will get some sleep.

Update

And here is how to recognize the MIME type. Add the following to /etc/magic.mime:

0               string          VjCD0100
>8              belong          0x04030201
>>12            bequad          0x0000000000000000
>>>20           beshort         0x0000
>>>>22          beshort         0x0080                  chemical/x-cdx
>>>>22          beshort         0x0000                  chemical/x-cdx

and run the file(1) command with the -i switch:

$ find . -name "*.cdx" -exec file -i "{}" ";"
./example.cdx: application/octet-stream
./dummy.cdx: application/octet-stream
./x-chemdraw/structures25-27.cdx: chemical/x-cdx
./x-chemdraw/structures96-101.cdx: chemical/x-cdx
./x-chemdraw/structures40-48.cdx: chemical/x-cdx
./x-chemdraw/dimethylamine.cdx: chemical/x-cdx
./x-chemdraw/dimethylaminesimple.cdx: chemical/x-cdx
./x-chemdraw/untitled.cdx: chemical/x-cdx
./x-chemdraw/structures1-12.cdx: chemical/x-cdx
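The MIME rules only need the first 24 bytes of a file, so the check is easy to reproduce outside of file(1). A minimal Python sketch follows. The function name cdx_mime_type is hypothetical, and the application/octet-stream fallback simply mirrors the output above; the real file(1) may report something else for files it recognizes by other rules.

```python
import struct

FALLBACK = "application/octet-stream"  # assumption: taken from the output above


def cdx_mime_type(data: bytes) -> str:
    """Apply the magic.mime rules for chemical/x-cdx to a header."""
    if len(data) < 24 or data[:8] != b"VjCD0100":
        return FALLBACK
    # belong at 8, bequad at 12, beshort at 20, beshort at 22 (big-endian).
    marker, reserved, zero, fmt = struct.unpack_from(">IQHH", data, 8)
    header_ok = (marker, reserved, zero) == (0x04030201, 0, 0)
    if header_ok and fmt in (0x0080, 0x0000):
        return "chemical/x-cdx"
    return FALLBACK
```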

And now I will really get some sleep. Cheerio!

Monday, February 5, 2007

chemical-mime-data 0.1.94 released

Today I released a new version of the chemical-mime-data package, namely 0.1.94. This version adds and improves support for various chemical MIME types (see below), fixes several build issues, and improves parts of the detection and the build system. RH bug #225095 (SF.net bug #1616568) has been fixed. The TODO list is also a little shorter now, because several items (source and package documentation) were completed with this release.

This version adds support for:

  • chemical/x-cactvs-ascii
  • chemical/x-cactvs-binary
  • chemical/x-cactvs-table
  • chemical/x-cdxml
  • chemical/x-gamess-output
  • chemical/x-gulp
  • chemical/x-ncbi-asn1
  • chemical/x-ncbi-asn1-binary
  • chemical/x-ncbi-asn1-xml

Support has been improved for:

  • chemical/x-cdx
  • chemical/x-cml
  • chemical/x-cif
  • chemical/x-dmol
  • chemical/x-gamess-input
  • chemical/x-gaussian-input
  • chemical/x-gaussian-log
  • chemical/x-genbank
  • chemical/x-hin
  • chemical/x-inchi
  • chemical/x-inchi-xml
  • chemical/x-mdl-rxnfile
  • chemical/x-mmcif
  • chemical/x-mol2
  • chemical/x-msi-car
  • chemical/x-msi-hessian
  • chemical/x-msi-mdf
  • chemical/x-msi-msi
  • chemical/x-pdb
  • chemical/x-shelx

The full release changelog can be found at the SF.net project site.

An updated Debian package will be available soon in the experimental tree (not in sid because of the release freeze).

Saturday, February 3, 2007

Ubuntu packaging support stopped

Ubuntu support will be stopped and the repository for packages built for Ubuntu removed. I no longer want to support Ubuntu and have removed all Ubuntu packages from the repository.