Showing posts with label html. Show all posts

12 May 2014

Generating wikipedia semantic links from a pubmed-id

In "Building a biomedical semantic network in Wikipedia with Semantic Wiki Links" (Database. 2012 Mar 20;2012), Benjamin Good et al. introduced the Semantic Wiki Link (SWL):

An SWL is a hyperlink on Wikipedia that allows the editor to explicitly specify the type of relationship between the concept described on the page being edited and the concept that is being linked to (http://en.wikipedia.org/wiki/Template:SWL). These SWLs are implemented using MediaWiki templates.
(...)
any programmer can now write computer programs to parse Wikipedia content for SWLs and import them into third-party tools (e.g. triplestores, etc.)
Example: Phospholamban:
The protein encoded by this gene is found as a pentamer and is a major substrate for the cAMP-dependent protein kinase ({{SWL|type=substrate_for|target=protein kinase A|label=PKA}}) in cardiac muscle.




Using Entrez-Ajax (Loman et al.) and the Wikipedia API, I wrote an HTML+JS interface to speed up the creation of SWL wiki-text from a PubMed ID.
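The last step of such an interface, producing the wiki-text itself, can be sketched as follows. This is only an illustration: the parameter names (type, target, label) come from the Phospholamban example quoted above, while the function name and the way spaces are turned into underscores are my own assumptions. The lookup of the PubMed record is not shown.

```javascript
// Build the wiki-text of a Semantic Wiki Link (SWL) template call.
// Parameter names follow the quoted example; buildSWL itself is hypothetical.
function buildSWL(type, target, label) {
  // Remove characters that would break the template syntax.
  const clean = (s) => String(s).replace(/[{}|]/g, " ").trim();
  const parts = [
    "SWL",
    "type=" + clean(type).replace(/\s+/g, "_"),
    "target=" + clean(target)
  ];
  if (label) parts.push("label=" + clean(label));
  return "{{" + parts.join("|") + "}}";
}

console.log(buildSWL("substrate for", "protein kinase A", "PKA"));
```

The call above rebuilds the template used in the Phospholamban sentence.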


and.. well, that's it,

Pierre

09 June 2013

How to fit a sentence in a rectangle with the Hershey vectorial font.

Via Wikipedia: The Hershey fonts are a collection of vector fonts developed circa 1967 by Dr. A. V. Hershey (...). Vector fonts are easily scaled and rotated in two or three dimensions; consequently the Hershey fonts have been widely used in computer graphics and computer-aided design programs. When programming, I often have to fit a sentence in a rectangle (for example to write the name of a short read in the graphical view of a BAM), so I wrote an XML version of the Hershey font.

<?xml version="1.0"?>
<hershey>
  <letter id="1" count="9" left="-5" right="5" char="a">
    <moveto x="0" y="-5"/>
    <lineto x="-4" y="4"/>
    <moveto x="0" y="-5"/>
    <lineto x="4" y="4"/>
    <moveto x="-2" y="1"/>
    <lineto x="2" y="1"/>
  </letter>
  <letter id="2" count="16" left="-5" right="5" char="b">
    <moveto x="-3" y="-5"/>
    <lineto x="-3" y="4"/>
    <moveto x="-3" y="-5"/>
    <lineto x="1" y="-5"/>
    <lineto x="3" y="-4"/>
    <lineto x="3" y="-2"/>
    <lineto x="1" y="-1"/>
    <moveto x="-3" y="-1"/>
    <lineto x="1" y="-1"/>
    <lineto x="3" y="0"/>
    <lineto x="3" y="3"/>
From there, I can generate some bindings for various programming languages using XSLT, for example JavaScript:
{
 "1":[{t:'M',x:0,y:-5},{t:'L',x:-4,y:4},{t:'M',x:0,y:-5},{t:'L',x:4,y:4},{t:'M',x:-2,y:1},{t:'L',x:2,y:1}],
 "2":[{t:'M',x:-3,y:-5},{t:'L',x:-3,y:4},{t:'M',x:-3,y:-5},{t:'L',x:1,y:-5},{t:'L',x:3,y:-4},{t:'L',x:3,y:-2},{t:'L',x:1,y:-1},{t:'M',x:-3,y:-1},{t:'L',x:1,y:-1},{t:'L',x:3,y:0},{t:'L',x:3,y:3},{t:'L',x:1,y:4},{t:'L',x:-3,y:4}], ...

In the Javascript example below, I'm generating some random rectangles where a sentence is written:
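A minimal sketch of the fitting step, assuming a glyph table keyed by character that combines the left/right bounds of the XML with the move/line operations of the JavaScript binding. The table layout and the function name are assumptions made for this example, and only the letter "a" is included.

```javascript
// Hypothetical glyph table: 'left'/'right' come from the XML attributes,
// 'ops' from the generated JavaScript binding.
const glyphs = {
  a: { left: -5, right: 5, ops: [
    {t:'M',x:0,y:-5},{t:'L',x:-4,y:4},{t:'M',x:0,y:-5},
    {t:'L',x:4,y:4},{t:'M',x:-2,y:1},{t:'L',x:2,y:1}] }
};

// Return the drawing operations of 'text' stretched to fill 'rect'.
// Characters missing from the table are skipped.
function fitSentence(text, rect, table) {
  const letters = Array.from(text).map((c) => table[c]).filter(Boolean);
  const totalWidth = letters.reduce((w, g) => w + (g.right - g.left), 0);
  const ys = letters.flatMap((g) => g.ops.map((op) => op.y));
  if (totalWidth <= 0 || ys.length === 0) return [];
  const minY = Math.min(...ys);
  const height = Math.max(...ys) - minY || 1; // avoid dividing by zero
  const sx = rect.width / totalWidth;  // horizontal scale
  const sy = rect.height / height;     // vertical scale
  const out = [];
  let cursor = 0; // left edge of the current letter, in font units
  for (const g of letters) {
    for (const op of g.ops) {
      out.push({
        t: op.t,
        x: rect.x + (cursor + op.x - g.left) * sx,
        y: rect.y + (op.y - minY) * sy
      });
    }
    cursor += g.right - g.left;
  }
  return out;
}
```

On a canvas, each returned operation would map to moveTo (t is 'M') or lineTo (t is 'L'). Here the two axes are scaled independently, so the text is stretched to the rectangle; keeping the aspect ratio would mean using the smaller of the two scales.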


That's it,
Pierre

22 May 2013

Drawing a Timeline with jquery.

In 2008, I wrote a XUL-based interface displaying a timeline (http://...freebase-and-history-of-sciences.html).
History of Sciences / Freebase
Here I've played with jquery to display another timeline:

html

There is no JSON data: everything is stored in the HTML. The years are surrounded by a <span/> element having a CSS class "start" or "end".


javascript

We use jquery to sort and layout each event.
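The sort-and-layout step can be sketched without jQuery on plain objects. In the page the years would be read from the "start" and "end" spans; the field names, the pixelsPerYear parameter and the function name below are assumptions of this sketch, not the original code.

```javascript
// Sort events by start year and compute a horizontal position for each.
// 'events' stands in for the values read from the HTML.
function layoutTimeline(events, pixelsPerYear) {
  if (events.length === 0) return [];
  const sorted = events.slice().sort(
    (a, b) => a.start - b.start || a.end - b.end);
  const firstYear = sorted[0].start;
  return sorted.map((e) => ({
    label: e.label,
    left: (e.start - firstYear) * pixelsPerYear,
    // Keep at least one pixel so that single-year events stay visible.
    width: Math.max(1, (e.end - e.start) * pixelsPerYear)
  }));
}
```

The resulting left and width values could then be applied to each event's element as CSS properties.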

CSS

A basic CSS for my timeline

Result


It's far from being perfect. I don't know how to handle the images overflowing the "div".

That's it,

Pierre


05 February 2013

Making use of Picard Metrics files using XML and XSLT. #ngs

Many tools in the Picard package produce a "Metrics File" (described at http://picard.sourceforge.net/picard-metric-definitions.shtml). The Picard API contains a Java parser, "MetricsFile", for reading those files:

MetricsFile<MetricBase, Comparable<?>> metricsFile=new MetricsFile<MetricBase, Comparable<?>>();
metricsFile.read(new FileReader("metrics.txt"));
In order to produce some custom reports from those files, I've created a tool that dumps the content of the MetricsFile as an XML file. The source code is available at: http://code.google.com/p/jvarkit/source/browse/trunk/src/main/java/fr/inserm/umr1087/jvarkit/tools/picard/metrics2xml/PicardMetricsToXML.java.

Compilation

$ mkdir tmp
$ javac -d tmp -cp  /path/to/picard.jar:/path/to/sam.jar \
     -sourcepath  src/main/java \
     src/main/java/fr/inserm/umr1087/jvarkit/tools/picard/metrics2xml/PicardMetricsToXML.java
$ jar vcf picardmetrics2xml.jar -C tmp .

Usage

Say you have used the tool 'CollectInsertSizeMetrics.jar' from picard:
$ java -jar /path/to/CollectInsertSizeMetrics.jar \
 O=out.metrics \
 I=/path/to/samtools/examples/sorted.bam \
 AS=true \
 R=/path/to/samtools/ex1.fa \
 H=chart.pdf
The file out.metrics looks like this:
## net.sf.picard.metrics.StringHeader
# net.sf.picard.analysis.CollectInsertSizeMetrics HISTOGRAM_FILE=(...)
## net.sf.picard.metrics.StringHeader
# Started on: Tue Feb 05 12:51:30 CET 2013

## METRICS CLASS net.sf.picard.analysis.InsertSizeMetrics
MEDIAN_INSERT_SIZE MEDIAN_ABSOLUTE_DEVIATION MIN_INSERT_SIZE MAX_INSERT_SIZE MEAN_INSERT_SIZE STANDARD_DEVIATION READ_PAIRS
209 10 54 243 208.857506 13.614603 4716 FR 5 9 13 17 21 25 29 35 43 

## HISTOGRAM java.lang.Integer
insert_size All_Reads.fr_count
54 3
170 3
173 9
174 3
175 3
177 6
(...)
This file can be converted to XML using the following command:
$ java -cp /path/to/picard.jar:/path/to/sam.jar:picardmetrics2xml.jar fr.inserm.umr1087.jvarkit.tools.picard.metrics2xml.PicardMetricsToXML file.metrics


<?xml version="1.0" encoding="UTF-8"?><picard-metrics xmlns="http://picard.sourc
eforge.net/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><metrics-file
 file="file.metrics"><headers><header class="net.sf.picard.metrics.StringHeader"
>net.sf.picard.analysis.CollectInsertSizeMetrics HISTOGRAM_FILE=jeter2 INPUT=/ho
me/lindenb/package/samtools-0.1.18/examples/sorted.bam OUTPUT=jeter REFERENCE_SE
QUENCE=/home/lindenb/package/samtools-0.1.18/examples/ex1.fa ASSUME_SORTED=true 
   DEVIATIONS=10.0 MINIMUM_PCT=0.05 METRIC_ACCUMULATION_LEVEL=[ALL_READS] STOP_A
FTER=0 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL
=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false</header><h
eader class="net.sf.picard.metrics.StringHeader">Started on: Tue Feb 05 12:51:30
 CET 2013</header></headers><metrics><thead class="net.sf.picard.analysis.Insert
SizeMetrics"><th class="double">MEDIAN_INSERT_SIZE</th><th class="double">MEDIAN
_ABSOLUTE_DEVIATION</th><th class="int">MIN_INSERT_SIZE</th><th class="int">MAX_
INSERT_SIZE</th><th class="double">MEAN_INSERT_SIZE</th><th class="double">STAND
ARD_DEVIATION</th><th class="long">READ_PAIRS</th><th class="net.sf.picard.sam.S
amPairUtil$PairOrientation">PAIR_ORIENTATION</th><th class="int">WIDTH_OF_10_PER
CENT</th><th class="int">WIDTH_OF_20_PERCENT</th><th class="int">WIDTH_OF_30_PER
CENT</th><th class="int">WIDTH_OF_40_PERCENT</th><th class="int">WIDTH_OF_50_PER
CENT</th><th class="int">WIDTH_OF_60_PERCENT</th><th class="int">WIDTH_OF_70_PER
CENT</th><th class="int">WIDTH_OF_80_PERCENT</th><th class="int">WIDTH_OF_90_PER
CENT</th><th class="int">WIDTH_OF_99_PERCENT</th><th class="java.lang.String">SA
MPLE</th><th class="java.lang.String">LIBRARY</th><th class="java.lang.String">R
EAD_GROUP</th></thead><tbody><tr><td>209.0</td><td>10.0</td><td>54</td><td>243</
td><td>208.857506</td><td>13.614603</td><td>4716</td><td>FR</td><td>5</td><td>9<
/td><td>13</td><td>17</td><td>21</td><td>25</td><td>29</td><td>35</td><td>43</td
><td>65</td><td xsi:nil="true"/><td xsi:nil="true"/><td xsi:nil="true"/></tr></t
body></metrics><histogram class="java.lang.Integer"><thead><th>insert_size</th><
th>All_Reads.fr_count</th></thead><tbody><tr><td>54</td><td>3.0</td></tr><tr><td
>170</td><td>3.0</td></tr><tr><td>173</td><td>9.0</td></tr><tr><td>174</td><td>3
.0</td></tr><tr><td>175</td><td>3.0</td></tr><tr><td>177</td><td>6.0</td></tr><t
r><td>178</td><td>6.0</td></tr><tr><td>179</td><td>9.0</td></tr><tr><td>180</td>
<td>6.0</td></tr><tr><td>181</td><td>6.0</td></tr><tr><td>182</td><td>21.0</td><
/tr><tr><td>183</td><td>9.0</td></tr><tr><td>184</td><td>15.0</td></tr><tr><td>1
85</td><td>33.0</td></tr><tr><td>186</td><td>15.0</td></tr><tr><td>187</td><td>3
(...)

Converting to JSON

Now, we can convert the XML to whatever we want using XSLT. I wrote a stylesheet, picardmetrics2json.xsl, converting the XML to JSON (though I should escape the quotes in the strings).
$ xsltproc picardmetrics2json.xsl metrics.xml


{
    "metrics.xml": {
        "headers": [
            {
                "class": "net.sf.picard.metrics.StringHeader",
                "value": "net.sf.picard.analysis.CollectInsertSizeMetrics HISTOGRAM_FILE=metrics.pdf INPUT=samtools-0.1.18/examples/sorted.bam OUTPUT=metrics.txt REFERENCE_SEQUENCE=/home/lindenb/package/samtools-0.1.18/examples/ex1.fa ASSUME_SORTED=true    DEVIATIONS=10.0 MINIMUM_PCT=0.05 METRIC_ACCUMULATION_LEVEL=[ALL_READS] STOP_AFTER=0 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false"
            },
            {
                "class": "net.sf.picard.metrics.StringHeader",
                "value": "Started on: Tue Feb 05 12:51:30 CET 2013"
            }
        ],
        "metrics": [
            {
                "MEDIAN_INSERT_SIZE": 209,
                "MEDIAN_ABSOLUTE_DEVIATION": 10,
                "MIN_INSERT_SIZE": 54,
                "MAX_INSERT_SIZE": 243,
                "MEAN_INSERT_SIZE": 208.857506,
                "STANDARD_DEVIATION": 13.614603,
                "READ_PAIRS": 4716,
                "PAIR_ORIENTATION": "FR",
                "WIDTH_OF_10_PERCENT": 5,
                "WIDTH_OF_20_PERCENT": 9,
                "WIDTH_OF_30_PERCENT": 13,
                "WIDTH_OF_40_PERCENT": 17,
                "WIDTH_OF_50_PERCENT": 21,
                "WIDTH_OF_60_PERCENT": 25,
                "WIDTH_OF_70_PERCENT": 29,
                "WIDTH_OF_80_PERCENT": 35,
                "WIDTH_OF_90_PERCENT": 43,
                "WIDTH_OF_99_PERCENT": 65,
                "SAMPLE": null,
                "LIBRARY": null,
                "READ_GROUP": null
            }
        ],
        "histogram": [
            {
                "insert_size": 54,
                "All_Reads.fr_count": 3
            },
            {
                "insert_size": 170,
                "All_Reads.fr_count": 3
            },(...)
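Once the metrics are in JSON, they are easy to consume from a script. The following sketch summarises the "histogram" array shown above; the field names match that output, but the function and the choice of statistics are only an illustration.

```javascript
// Summarise the "histogram" array of the JSON output:
// total count, count-weighted mean insert size and most frequent size.
function summarizeHistogram(histogram) {
  let total = 0;
  let weighted = 0;
  let mode = null;
  for (const bin of histogram) {
    const size = bin["insert_size"];
    const count = bin["All_Reads.fr_count"];
    total += count;
    weighted += size * count;
    if (mode === null || count > mode.count) mode = { size, count };
  }
  return {
    total,
    mean: total > 0 ? weighted / total : NaN,
    mode: mode ? mode.size : null
  };
}
```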

Converting to HTML

Another stylesheet converts the XML to HTML. It also produces the JavaScript code to display the histograms using Google Chart:
$ xsltproc picardmetrics2html.xsl metrics.xml > output.html


That's it,
Pierre

07 January 2012

A CGI-version of samtools tview.

I've created a lightweight CGI-based web application for samtools tview. This C++ program, named ngsproject.cgi, uses the samtools API and allows any user to visualize all the alignments in a given NGS project. The projects and their BAMs are defined on the server side using a simple XML document, e.g.:

<?xml version="1.0"?>
<projects>
 <reference id="hg19">
  <path>/home/lindenb/samtools-0.1.18/examples/ex1.fa</path>
 </reference>
 <bam id="b1">
  <sample>Sample 1</sample>
  <path>/home/lindenb/samtools-0.1.18/examples/ex1.bam</path>
 </bam>
 <bam id="b2">
  <sample>Sample 2</sample>
  <path>/home/lindenb/samtools-0.1.18/examples/ex1.bam</path>
 </bam>
 <project id="1">
  <name>Test 1</name>
  <description>Test</description>
  <bam ref="b1"/>
  <bam ref="b2"/>
  <reference ref="hg19" />
 </project>
 <project id="2">
  <name>Test 2</name>
  <description>Test</description>
  <bam ref="b2"/>
  <reference ref="hg19" />
 </project>
</projects>
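The id/ref indirection of this document can be illustrated with a small sketch. Here the configuration is rewritten by hand as plain JavaScript objects; that object layout and the resolveProject function are assumptions made for the example and say nothing about how ngsproject.cgi itself reads the XML.

```javascript
// The configuration above, rewritten as plain objects for this sketch.
const config = {
  references: { hg19: "/home/lindenb/samtools-0.1.18/examples/ex1.fa" },
  bams: {
    b1: { sample: "Sample 1", path: "/home/lindenb/samtools-0.1.18/examples/ex1.bam" },
    b2: { sample: "Sample 2", path: "/home/lindenb/samtools-0.1.18/examples/ex1.bam" }
  },
  projects: {
    "1": { name: "Test 1", bams: ["b1", "b2"], reference: "hg19" },
    "2": { name: "Test 2", bams: ["b2"], reference: "hg19" }
  }
};

// Resolve the 'ref' attributes of a project into concrete files.
function resolveProject(cfg, projectId) {
  const project = cfg.projects[projectId];
  if (!project) throw new Error("unknown project: " + projectId);
  const reference = cfg.references[project.reference];
  if (!reference) throw new Error("unknown reference: " + project.reference);
  const bams = project.bams.map((id) => {
    if (!cfg.bams[id]) throw new Error("unknown bam: " + id);
    return cfg.bams[id];
  });
  return { name: project.name, reference, bams };
}
```

Declaring each BAM and reference once and pointing to it by id lets several projects share the same files without repeating their paths.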

Once the CGI has been installed, the user can visualize the reads of each sample.

This tool is available in the variation toolkit at http://code.google.com/p/variationtoolkit/.

That's it.

Pierre

01 August 2011

Using the Freebase and the Bioportal Widgets to create a semantic object.


The following HTML code uses:
After completion, it generates a JSON object describing a semantic object:
{
"subject":"http://www.freebase.com/view/en/nsp3",
"predicate":"http://purl.org/obo/owl/MI#MI_0407",
"value":"http://www.freebase.com/view/en/roxan",
"pmid":15047801
}



Source


You can download the source and test it:


That's it,

Pierre

22 April 2011

Playing with the HTML5 File API: translating a Fasta file.

In the current post, I'm using the new HTML5 File API. This new API can read the content of a file on the client side without needing a remote server. Let me repeat this:

YOU DO NOT NEED A SERVER
YOU DO NOT NEED TO COPY AND PASTE THE CONTENT OF THE FILE IN A TEXTAREA.
As an example, the following code reads a whole DNA FASTA file stored on your computer and translates each DNA sequence to a protein. When the user selects a new file, a FileReader object is created and a callback function translating the DNA is invoked when the FASTA file has been loaded.
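The idea can be sketched as below. This is not the original source of the post: the function names are mine, the translation uses the standard genetic code in the first reading frame only, and the last function relies on the browser's FileReader, so it only runs in a page.

```javascript
// Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
const BASES = "TCAG";
const AMINO_ACIDS =
  "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

// Translate a DNA string in frame 1; unknown codons become 'X'.
function translate(dna) {
  const seq = dna.toUpperCase();
  let protein = "";
  for (let i = 0; i + 3 <= seq.length; i += 3) {
    const a = BASES.indexOf(seq[i]);
    const b = BASES.indexOf(seq[i + 1]);
    const c = BASES.indexOf(seq[i + 2]);
    protein += (a < 0 || b < 0 || c < 0)
      ? "X"
      : AMINO_ACIDS[16 * a + 4 * b + c];
  }
  return protein;
}

// Split FASTA text into records {name, sequence}.
function parseFasta(text) {
  const records = [];
  for (const line of text.split(/\r?\n/)) {
    if (line.startsWith(">")) {
      records.push({ name: line.slice(1).trim(), sequence: "" });
    } else if (records.length > 0) {
      records[records.length - 1].sequence += line.trim();
    }
  }
  return records;
}

// Browser-only wiring: read the selected file, then translate each record.
// 'file' would come from an <input type="file"> element.
function handleFileSelection(file, onDone) {
  const reader = new FileReader();
  reader.onload = () => onDone(parseFasta(reader.result).map(
    (r) => ({ name: r.name, protein: translate(r.sequence) })));
  reader.readAsText(file);
}
```

Everything happens in the browser: the file is read with readAsText and the callback receives the translated records, without any upload.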

Test (your browser must support HTML5)


Source code



That's it,

Pierre

22 March 2011

Blast Stylesheet : XML to HTML

I wrote an XSLT stylesheet for the following question on Biostar: "I'd like to create an HTML file (from the XML file and XSL stylesheet) similar to what It can be achieved when we performed a BLAST search on the NCBI server."

The stylesheet I wrote is available on GitHub at: https://github.com/lindenb/xslt-sandbox/blob/master/stylesheets/bio/ncbi/blast2html.xsl (see also my previous post, blast2svg).

Usage:

xsltproc --novalid blast2html.xsl blast.xml > blast.html

Example:

Here is an XML output of BLAST:
<BlastOutput>
<BlastOutput_program>blastp</BlastOutput_program>
<BlastOutput_version>BLASTP 2.2.25+</BlastOutput_version>
<BlastOutput_reference>Alejandro A. Sch&auml;ffer, L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001), "Improving the accuracy of PSI-BLAST protein database searches with composition-based statistics and other refinements", Nucleic Acids Res. 29:2994-3005.</BlastOutput_reference>
<BlastOutput_db>N/A</BlastOutput_db>
<BlastOutput_query-ID>gi|187956781|gb|AAI40897.1|</BlastOutput_query-ID>
<BlastOutput_query-def>EIF4G1 protein [Homo sapiens]</BlastOutput_query-def>
<BlastOutput_query-len>1606</BlastOutput_query-len>
<BlastOutput_param>
<Parameters>
<Parameters_matrix>BLOSUM62</Parameters_matrix>
<Parameters_expect>10</Parameters_expect>
<Parameters_gap-open>11</Parameters_gap-open>
<Parameters_gap-extend>1</Parameters_gap-extend>
<Parameters_filter>F</Parameters_filter>
</Parameters>
</BlastOutput_param>
<BlastOutput_iterations>
<Iteration>
<Iteration_iter-num>1</Iteration_iter-num>
<Iteration_query-ID>gi|187956781|gb|AAI40897.1|</Iteration_query-ID>
<Iteration_query-def>EIF4G1 protein [Homo sapiens]</Iteration_query-def>
<Iteration_query-len>1606</Iteration_query-len>
<Iteration_hits>
<Hit>
<Hit_num>1</Hit_num>
<Hit_id>gi|293340930|ref|XP_002724789.1|</Hit_id>
<Hit_def>PREDICTED: eukaryotic translation initiation factor 4 gamma, 1 isoform 2 [Rattus norvegicus] >gi|293352298|ref|XP_002727969.1| PREDICTED: eukaryotic translation initiation factor 4, gamma 1 isoform 1 [Rattus norvegicus]</Hit_def>
<Hit_accession>XP_002727969</Hit_accession>
<Hit_len>1584</Hit_len>
<Hit_hsps>
<Hsp>
<Hsp_num>1</Hsp_num>
<Hsp_bit-score>2715.64</Hsp_bit-score>
<Hsp_score>7038</Hsp_score>
<Hsp_evalue>0</Hsp_evalue>
<Hsp_query-from>1</Hsp_query-from>
<Hsp_query-to>1606</Hsp_query-to>
<Hsp_hit-from>1</Hsp_hit-from>
<Hsp_hit-to>1584</Hsp_hit-to>
<Hsp_query-frame>0</Hsp_query-frame>
<Hsp_hit-frame>0</Hsp_hit-frame>
<Hsp_identity>1450</Hsp_identity>
<Hsp_positive>1450</Hsp_positive>
<Hsp_gaps>36</Hsp_gaps>
<Hsp_align-len>1613</Hsp_align-len>
<Hsp_qseq>MNKAPQSTGPPPAPSPGLPQPAFPPGQTAPVVFSTPQATQMNTPSQPRQGGFRSLQHFYPSRAQPPSSAASRVQSAAPARPGPAAHVYPAGSQVMMIPSQISYPASQGAYYIPGQGRSTYVVPTQQYPVQPGAPGFYPGASPTEFGTYAGAYYPAQGVQQFPTGVAPAPVLMNQPPQIAPKRERKTIRIRDPNQGGKDITEEIMSGARTASTPTPPQTGGGLEPQANGETPQVAVIVRPDDRSQGAIIADRPGLPGPEHSP-SESQPSSPSPTPSPSPVLEPGSEPNLAVLSIPGDTMTT--IQMSVEESTPISRETGEPYRLSPEPTPLAEPILEVEVTLSKPVPESEFSSSPLQAPTPLASHTVEIHEPNGMVPSEDLEPEVESSPELAPPP--ACPSESPVPIAPTAQPEELLNGAPSPPAVDLSPVSEPEEQAKEV-TASMAPPTIPSATPATAPSATSPAQEEEMEEEEEEEEGEAGEAGEAESEKGGEELLPPESTPIPANLSQNLEAAAATQVAVSVPKRRRKIKELNKKEAVGDLLDAFKEANPAVPEVENQPPAGSNPGPESEGSGVPPRPEEADETWDSKEDKIHNAENIQPGEQKYEYKSDQWKPLNLEEKKRYDREFLLGFQFIFASMQKPEGLPHISDVVLDKANKTPLRPLDPTRLQGINCGPDFTPSFANLGRTTLSTRGPPRGGPGGELPRGPAGLGPRRSQQGPRKEPRKIIATVLMTEDIKLNKAEKAWKPSSKRTAADKDRGEEDADGSKTQDLFRRVRSILNKLTPQMFQQLMKQVTQLAIDTEERLKGVIDLIFEKAISEPNFSVAYANMCRCLMALKVPTTEKPTVTVNFRKLLLNRCQKEFEKDKDDDEVFEKKQKEMDEAATAEERGRLKEELEEARDIARRRSLGNIKFIGELFKLKMLTEAIMHDCVVKLLKNHDEESLECLCRLLTTIGKDLDFEKAKPRMDQYFNQMEKIIKEKKTSSRIRFMLQDVLDLRGSNWVPRRGDQGPKTIDQIHKEAEMEEHREHIKVQQLMAKGSDKRRGGPPGPPISRGLPLVDDGGWNTVPISKGSRPIDTSRLTKITKPGSIDSNNQLFAPGGRLSWGKGSSGGSGAKPSDAASEAARPATSTLNRFSALQQAVPTESTDNRRVVQRSSLSRERGEKAGDRGDRLERSERGGDRGDRLDRARTPATKRSFSKEVEERSRERPSQPEGLRKAASLTEDRDRGRDAVKREAALPPVSPLKAALSEEELEKKSKAIIEEYLHLNDMKEAVQCVQELASPSLLFIFVRHGVESTLERSAIAREHMGQLLHQLLCAGHLSTAQYYQGLYEILELAEDMEIDIPHVWLYLAELVTPILQEGGVPMGELFREITKPLRPLGKAASLLLEILGLLCKSMGPKKVGTLWREAGLSWKEFLPEGQDIGAFVAEQKVEYTLGEESEAPGQRALPSEELNRQLEKLLKEGSSNQRVFDWIEANLSEQQIVSNTLVRALMTAVCYSAIIFETPLRVDVAVLKARAKLLQKYLCDEQKELQALYALQALVVTLEQPPNLLRMFFDALYDEDVVKEDAFYSWESSKDPAEQQGKGVALKSVTAFFKWLREAE-EESDHN</Hsp_qseq>
<Hsp_hseq>MNKAPQPTGPPPARSPGLPQPAFPPGQTAPVVFSTPQATQMNTPSQPRQ-------HFYPSRAQPPSSAASRVQSAAPARPGPAPHVYPAGSQVMMIPSQISYSASQGAYYIPGQGRSTYVVPTQQYPVQPGAPGFYPGASPTEFGTYAGAYYPAQSVQQFPASVAPAPVLMNQPPQIAPKRERKTIRIRDPNQGGKDITEEIMSGARTASTPTPPQTGGSLEPQPNGESPQVAVIIRPDDRSQGAAIGGRPGLPGPEHSPGTESQPSSPSPTPSPPPILEPGSESNLGVLSIPGDTMTTGMIPISVEESTPISCESGEPYCLSPEPT-LAEPILEVEVTLSKPIPESEFSSSPLQVSTSLVPHRAETHEPNGVIPSEDLEPEVESSTEPAPPPLSACASESLVPIAPTAQPEELLNGAPSPPAVDLSPVSEPEEQAKEVPSAALA--SIVSPTPPVAPSDTSAAQEEEIEED-------EDEDGEAESEKGGEDL-PLDSTPVPAQLSQNLEVAAAPQVAVSVPKRRRKIKELNKKEAVGDLLDAFKEVDPAVPEVENQPPTGSNPSPESEGSAALPQPEEAEETWDSKEDKIHNAENIQPGEQKYEYKSDQWKPLNLEEKKRYDREFLLGFQFIFASMQKPEGLPHITDVVLDKANKTPLRSLDPSRLPGINCGPDFTPSFANLGRPTLSSRGPPRGGPGGELPRGPAGLGPRRSQQGPRKETRKIISSVIMTEDIKLNKAEKAWKPSSKRTAADKDRGEEDADGSKTQDLFRRVRSILNKLTPQMFQQLMKQVTQLAIDTEERLKGVIDLIFEKAISEPNFSVAYANMCRCLMALKVPTTEKPTVTVNFRKLLLNRCQKEFEKDKDDDEVFEKKQKEMDEAATAEERGRLKEELEEARDIARRRSLGNIKFIGELFKLKMLTEAIMHDCVVKLLKNHDEESLECLCRLLTTIGKDLDFAKAKPRMDQYFNQMEKIIKEKKTSSRIRFMLQDVLDLRQSNWVPRRGDQGPKTIDQIHKEAEMEEHREHIKVQQLMAKGGDKRRGGPPGPP-------VNDGGWNTVPISKGSRPIDTSRLTKITKPGSIDSNNQLFAPGGRLSWGKGSSGGSGAKPSDTASEATRPA--TLNRFSALQQTLPVENTDNRRVVQRSSLSRERGEKAGDRGDRLERSERGGDRGDRLDRARTPATKRSFSKEVEERSRERPSQPEGLRKAASLTE--DRGRDPVKREATLPPVSPPKAALAVDEVERKSKAIIEEYLHLNDMKEAVQCVQELASPSLLFIFVRLGIESTLERSTIAREHMGRLLHQLLCAGHLSTAQYYQGLYETLELAEDMEIDIPHVWLYLAELITPILQEDGVPMGELFREITKPLRPMGKATSLLLEILGLLCKSMGPKKVGMLWREAGLSWREFLAEGQDVGSFVAEKKVEYTLGEESEAPGQRALAFEELRRQLEKLLKDGGSNQRVFDWIEANLNEQQIASNTLVRALMTTVCYSAIIFETPLRVDVQVLKVRARLLQKYLSDEQKELQALYALQALVVTLEQPANLLRMFFDALYDEDVVKEDAFYSWESSKDPAEQQGKGVALKSVTAFFNWLREAEDEESDHN</Hsp_hseq>
<Hsp_midline>MNKAPQ TGPPPA SPGLPQPAFPPGQTAPVVFSTPQATQMNTPSQPRQ HFYPSRAQPPSSAASRVQSAAPARPGPA HVYPAGSQVMMIPSQISY ASQGAYYIPGQGRSTYVVPTQQYPVQPGAPGFYPGASPTEFGTYAGAYYPAQ VQQFP VAPAPVLMNQPPQIAPKRERKTIRIRDPNQGGKDITEEIMSGARTASTPTPPQTGG LEPQ NGE PQVAVI RPDDRSQGA I RPGLPGPEHSP ESQPSSPSPTPSP P LEPGSE NL VLSIPGDTMTT I SVEESTPIS E GEPY LSPEPT LAEPILEVEVTLSKP PESEFSSSPLQ T L H E HEPNG PSEDLEPEVESS E APPP AC SES VPIAPTAQPEELLNGAPSPPAVDLSPVSEPEEQAKEV A A I S TP APS TS AQEEE EE E GEAESEKGGE L P STP PA LSQNLE AAA QVAVSVPKRRRKIKELNKKEAVGDLLDAFKE PAVPEVENQPP GSNP PESEGS P PEEA ETWDSKEDKIHNAENIQPGEQKYEYKSDQWKPLNLEEKKRYDREFLLGFQFIFASMQKPEGLPHI DVVLDKANKTPLR LDP RL GINCGPDFTPSFANLGR TLS RGPPRGGPGGELPRGPAGLGPRRSQQGPRKE RKII V MTEDIKLNKAEKAWKPSSKRTAADKDRGEEDADGSKTQDLFRRVRSILNKLTPQMFQQLMKQVTQLAIDTEERLKGVIDLIFEKAISEPNFSVAYANMCRCLMALKVPTTEKPTVTVNFRKLLLNRCQKEFEKDKDDDEVFEKKQKEMDEAATAEERGRLKEELEEARDIARRRSLGNIKFIGELFKLKMLTEAIMHDCVVKLLKNHDEESLECLCRLLTTIGKDLDF KAKPRMDQYFNQMEKIIKEKKTSSRIRFMLQDVLDLR SNWVPRRGDQGPKTIDQIHKEAEMEEHREHIKVQQLMAKG DKRRGGPPGPP V DGGWNTVPISKGSRPIDTSRLTKITKPGSIDSNNQLFAPGGRLSWGKGSSGGSGAKPSD ASEA RPA TLNRFSALQQ P E TDNRRVVQRSSLSRERGEKAGDRGDRLERSERGGDRGDRLDRARTPATKRSFSKEVEERSRERPSQPEGLRKAASLTE DRGRD VKREA LPPVSP KAAL E E KSKAIIEEYLHLNDMKEAVQCVQELASPSLLFIFVR G ESTLERS IAREHMG LLHQLLCAGHLSTAQYYQGLYE LELAEDMEIDIPHVWLYLAEL TPILQE GVPMGELFREITKPLRP GKA SLLLEILGLLCKSMGPKKVG LWREAGLSW EFL EGQD G FVAE KVEYTLGEESEAPGQRAL EEL RQLEKLLK G SNQRVFDWIEANL EQQI SNTLVRALMT VCYSAIIFETPLRVDV VLK RA LLQKYL DEQKELQALYALQALVVTLEQP NLLRMFFDALYDEDVVKEDAFYSWESSKDPAEQQGKGVALKSVTAFF WLREAE EESDHN</Hsp_midline>
</Hsp>
</Hit_hsps>
</Hit>
</Iteration_hits>
<Iteration_stat>
<Statistics>
<Statistics_db-num>0</Statistics_db-num>
<Statistics_db-len>0</Statistics_db-len>
<Statistics_hsp-len>0</Statistics_hsp-len>
<Statistics_eff-space>0</Statistics_eff-space>
<Statistics_kappa>-1</Statistics_kappa>
<Statistics_lambda>-1</Statistics_lambda>
<Statistics_entropy>-1</Statistics_entropy>
</Statistics>
</Iteration_stat>
</Iteration>
</BlastOutput_iterations>
</BlastOutput>

After processing:

(...)

Descriptions

Accession | Def | e-value
XP_002727969 | PREDICTED: eukaryotic translation initiation factor 4 gamma, 1 isoform 2 [Rattus norvegicus] >gi|293352298|ref|XP_002727969.1| PREDICTED: eukaryotic translation initiation factor 4, gamma 1 isoform 1 [Rattus norvegicus] | 0
(...)

Alignments

>gi|293340930|ref|XP_002724789.1||XP_002727969|PREDICTED: eukaryotic translation initiation factor 4 gamma, 1 isoform 2 [Rattus norvegicus] >gi|293352298|ref|XP_002727969.1| PREDICTED: eukaryotic translation initiation factor 4, gamma 1 isoform 1 [Rattus norvegicus]
Length=1584
Score = 2715.64 bits (7038), Expect = 0
Identities = 1450/1613 (89.8946063236206%), Gaps = 36/1613 (2.231866088034718%)
Strand = Plus/Plus

Query 1 MNKAPQSTGPPPAPSPGLPQPAFPPGQTAPVVFSTPQATQMNTPSQPRQGGFRSLQHFYP 60
MNKAPQ TGPPPA SPGLPQPAFPPGQTAPVVFSTPQATQMNTPSQPRQ HFYP
Sbjct 1 MNKAPQPTGPPPARSPGLPQPAFPPGQTAPVVFSTPQATQMNTPSQPRQ-------HFYP 53

Query 61 SRAQPPSSAASRVQSAAPARPGPAAHVYPAGSQVMMIPSQISYPASQGAYYIPGQGRSTY 120
SRAQPPSSAASRVQSAAPARPGPA HVYPAGSQVMMIPSQISY ASQGAYYIPGQGRSTY
Sbjct 54 SRAQPPSSAASRVQSAAPARPGPAPHVYPAGSQVMMIPSQISYSASQGAYYIPGQGRSTY 113

Query 121 VVPTQQYPVQPGAPGFYPGASPTEFGTYAGAYYPAQGVQQFPTGVAPAPVLMNQPPQIAP 180
VVPTQQYPVQPGAPGFYPGASPTEFGTYAGAYYPAQ VQQFP VAPAPVLMNQPPQIAP
Sbjct 114 VVPTQQYPVQPGAPGFYPGASPTEFGTYAGAYYPAQSVQQFPASVAPAPVLMNQPPQIAP 173

Query 181 KRERKTIRIRDPNQGGKDITEEIMSGARTASTPTPPQTGGGLEPQANGETPQVAVIVRPD 240
KRERKTIRIRDPNQGGKDITEEIMSGARTASTPTPPQTGG LEPQ NGE PQVAVI RPD
Sbjct 174 KRERKTIRIRDPNQGGKDITEEIMSGARTASTPTPPQTGGSLEPQPNGESPQVAVIIRPD 233

Query 241 DRSQGAIIADRPGLPGPEHSP-SESQPSSPSPTPSPSPVLEPGSEPNLAVLSIPGDTMTT 299
DRSQGA I RPGLPGPEHSP ESQPSSPSPTPSP P LEPGSE NL VLSIPGDTMTT
Sbjct 234 DRSQGAAIGGRPGLPGPEHSPGTESQPSSPSPTPSPPPILEPGSESNLGVLSIPGDTMTT 293

Query 300 --IQMSVEESTPISRETGEPYRLSPEPTPLAEPILEVEVTLSKPVPESEFSSSPLQAPTP 357
I SVEESTPIS E GEPY LSPEPT LAEPILEVEVTLSKP PESEFSSSPLQ T
Sbjct 294 GMIPISVEESTPISCESGEPYCLSPEPT-LAEPILEVEVTLSKPIPESEFSSSPLQVSTS 352

Query 358 LASHTVEIHEPNGMVPSEDLEPEVESSPELAPPP--ACPSESPVPIAPTAQPEELLNGAP 415
L H E HEPNG PSEDLEPEVESS E APPP AC SES VPIAPTAQPEELLNGAP
Sbjct 353 LVPHRAETHEPNGVIPSEDLEPEVESSTEPAPPPLSACASESLVPIAPTAQPEELLNGAP 412

Query 416 SPPAVDLSPVSEPEEQAKEV-TASMAPPTIPSATPATAPSATSPAQEEEMEEEEEEEEGE 474
SPPAVDLSPVSEPEEQAKEV A A I S TP APS TS AQEEE EE
Sbjct 413 SPPAVDLSPVSEPEEQAKEVPSAALA--SIVSPTPPVAPSDTSAAQEEEIEED------- 463

Query 475 AGEAGEAESEKGGEELLPPESTPIPANLSQNLEAAAATQVAVSVPKRRRKIKELNKKEAV 534
E GEAESEKGGE L P STP PA LSQNLE AAA QVAVSVPKRRRKIKELNKKEAV
Sbjct 464 EDEDGEAESEKGGEDL-PLDSTPVPAQLSQNLEVAAAPQVAVSVPKRRRKIKELNKKEAV 522

Query 535 GDLLDAFKEANPAVPEVENQPPAGSNPGPESEGSGVPPRPEEADETWDSKEDKIHNAENI 594
GDLLDAFKE PAVPEVENQPP GSNP PESEGS P PEEA ETWDSKEDKIHNAENI
Sbjct 523 GDLLDAFKEVDPAVPEVENQPPTGSNPSPESEGSAALPQPEEAEETWDSKEDKIHNAENI 582

Query 595 QPGEQKYEYKSDQWKPLNLEEKKRYDREFLLGFQFIFASMQKPEGLPHISDVVLDKANKT 654
QPGEQKYEYKSDQWKPLNLEEKKRYDREFLLGFQFIFASMQKPEGLPHI DVVLDKANKT
Sbjct 583 QPGEQKYEYKSDQWKPLNLEEKKRYDREFLLGFQFIFASMQKPEGLPHITDVVLDKANKT 642

Query 655 PLRPLDPTRLQGINCGPDFTPSFANLGRTTLSTRGPPRGGPGGELPRGPAGLGPRRSQQG 714
PLR LDP RL GINCGPDFTPSFANLGR TLS RGPPRGGPGGELPRGPAGLGPRRSQQG
Sbjct 643 PLRSLDPSRLPGINCGPDFTPSFANLGRPTLSSRGPPRGGPGGELPRGPAGLGPRRSQQG 702

Query 715 PRKEPRKIIATVLMTEDIKLNKAEKAWKPSSKRTAADKDRGEEDADGSKTQDLFRRVRSI 774
PRKE RKII V MTEDIKLNKAEKAWKPSSKRTAADKDRGEEDADGSKTQDLFRRVRSI
Sbjct 703 PRKETRKIISSVIMTEDIKLNKAEKAWKPSSKRTAADKDRGEEDADGSKTQDLFRRVRSI 762

Query 775 LNKLTPQMFQQLMKQVTQLAIDTEERLKGVIDLIFEKAISEPNFSVAYANMCRCLMALKV 834
LNKLTPQMFQQLMKQVTQLAIDTEERLKGVIDLIFEKAISEPNFSVAYANMCRCLMALKV
Sbjct 763 LNKLTPQMFQQLMKQVTQLAIDTEERLKGVIDLIFEKAISEPNFSVAYANMCRCLMALKV 822

Query 835 PTTEKPTVTVNFRKLLLNRCQKEFEKDKDDDEVFEKKQKEMDEAATAEERGRLKEELEEA 894
PTTEKPTVTVNFRKLLLNRCQKEFEKDKDDDEVFEKKQKEMDEAATAEERGRLKEELEEA
Sbjct 823 PTTEKPTVTVNFRKLLLNRCQKEFEKDKDDDEVFEKKQKEMDEAATAEERGRLKEELEEA 882

Query 895 RDIARRRSLGNIKFIGELFKLKMLTEAIMHDCVVKLLKNHDEESLECLCRLLTTIGKDLD 954
RDIARRRSLGNIKFIGELFKLKMLTEAIMHDCVVKLLKNHDEESLECLCRLLTTIGKDLD
Sbjct 883 RDIARRRSLGNIKFIGELFKLKMLTEAIMHDCVVKLLKNHDEESLECLCRLLTTIGKDLD 942

Query 955 FEKAKPRMDQYFNQMEKIIKEKKTSSRIRFMLQDVLDLRGSNWVPRRGDQGPKTIDQIHK 1014
F KAKPRMDQYFNQMEKIIKEKKTSSRIRFMLQDVLDLR SNWVPRRGDQGPKTIDQIHK
Sbjct 943 FAKAKPRMDQYFNQMEKIIKEKKTSSRIRFMLQDVLDLRQSNWVPRRGDQGPKTIDQIHK 1002

Query 1015 EAEMEEHREHIKVQQLMAKGSDKRRGGPPGPPISRGLPLVDDGGWNTVPISKGSRPIDTS 1074
EAEMEEHREHIKVQQLMAKG DKRRGGPPGPP V DGGWNTVPISKGSRPIDTS
Sbjct 1003 EAEMEEHREHIKVQQLMAKGGDKRRGGPPGPP-------VNDGGWNTVPISKGSRPIDTS 1055

Query 1075 RLTKITKPGSIDSNNQLFAPGGRLSWGKGSSGGSGAKPSDAASEAARPATSTLNRFSALQ 1134
RLTKITKPGSIDSNNQLFAPGGRLSWGKGSSGGSGAKPSD ASEA RPA TLNRFSALQ
Sbjct 1056 RLTKITKPGSIDSNNQLFAPGGRLSWGKGSSGGSGAKPSDTASEATRPA--TLNRFSALQ 1113

Query 1135 QAVPTESTDNRRVVQRSSLSRERGEKAGDRGDRLERSERGGDRGDRLDRARTPATKRSFS 1194
Q P E TDNRRVVQRSSLSRERGEKAGDRGDRLERSERGGDRGDRLDRARTPATKRSFS
Sbjct 1114 QTLPVENTDNRRVVQRSSLSRERGEKAGDRGDRLERSERGGDRGDRLDRARTPATKRSFS 1173

Query 1195 KEVEERSRERPSQPEGLRKAASLTEDRDRGRDAVKREAALPPVSPLKAALSEEELEKKSK 1254
KEVEERSRERPSQPEGLRKAASLTE DRGRD VKREA LPPVSP KAAL E E KSK
Sbjct 1174 KEVEERSRERPSQPEGLRKAASLTE--DRGRDPVKREATLPPVSPPKAALAVDEVERKSK 1231

Query 1255 AIIEEYLHLNDMKEAVQCVQELASPSLLFIFVRHGVESTLERSAIAREHMGQLLHQLLCA 1314
AIIEEYLHLNDMKEAVQCVQELASPSLLFIFVR G ESTLERS IAREHMG LLHQLLCA
Sbjct 1232 AIIEEYLHLNDMKEAVQCVQELASPSLLFIFVRLGIESTLERSTIAREHMGRLLHQLLCA 1291

Query 1315 GHLSTAQYYQGLYEILELAEDMEIDIPHVWLYLAELVTPILQEGGVPMGELFREITKPLR 1374
GHLSTAQYYQGLYE LELAEDMEIDIPHVWLYLAEL TPILQE GVPMGELFREITKPLR
Sbjct 1292 GHLSTAQYYQGLYETLELAEDMEIDIPHVWLYLAELITPILQEDGVPMGELFREITKPLR 1351

Query 1375 PLGKAASLLLEILGLLCKSMGPKKVGTLWREAGLSWKEFLPEGQDIGAFVAEQKVEYTLG 1434
P GKA SLLLEILGLLCKSMGPKKVG LWREAGLSW EFL EGQD G FVAE KVEYTLG
Sbjct 1352 PMGKATSLLLEILGLLCKSMGPKKVGMLWREAGLSWREFLAEGQDVGSFVAEKKVEYTLG 1411

Query 1435 EESEAPGQRALPSEELNRQLEKLLKEGSSNQRVFDWIEANLSEQQIVSNTLVRALMTAVC 1494
EESEAPGQRAL EEL RQLEKLLK G SNQRVFDWIEANL EQQI SNTLVRALMT VC
Sbjct 1412 EESEAPGQRALAFEELRRQLEKLLKDGGSNQRVFDWIEANLNEQQIASNTLVRALMTTVC 1471

Query 1495 YSAIIFETPLRVDVAVLKARAKLLQKYLCDEQKELQALYALQALVVTLEQPPNLLRMFFD 1554
YSAIIFETPLRVDV VLK RA LLQKYL DEQKELQALYALQALVVTLEQP NLLRMFFD
Sbjct 1472 YSAIIFETPLRVDVQVLKVRARLLQKYLSDEQKELQALYALQALVVTLEQPANLLRMFFD 1531

Query 1555 ALYDEDVVKEDAFYSWESSKDPAEQQGKGVALKSVTAFFKWLREAE-EESDHN 1606
ALYDEDVVKEDAFYSWESSKDPAEQQGKGVALKSVTAFF WLREAE EESDHN
Sbjct 1532 ALYDEDVVKEDAFYSWESSKDPAEQQGKGVALKSVTAFFNWLREAEDEESDHN 1584
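The block layout above (60 columns per row, with the coordinates skipping gap characters) can be reproduced with a few lines of JavaScript. This is a sketch of the bookkeeping only, not a port of the stylesheet; the function name and the shape of the returned objects are assumptions.

```javascript
// Cut a pairwise alignment into fixed-width blocks and compute the
// sequence coordinates of each block. Gap characters ('-') do not
// advance the coordinate.
function alignmentBlocks(qseq, midline, hseq, qFrom, hFrom, width) {
  const residues = (s) => s.replace(/-/g, "").length;
  const blocks = [];
  let q = qFrom; // next query coordinate
  let h = hFrom; // next subject coordinate
  for (let i = 0; i < qseq.length; i += width) {
    const qs = qseq.slice(i, i + width);
    const hs = hseq.slice(i, i + width);
    const qn = residues(qs);
    const hn = residues(hs);
    blocks.push({
      query: { from: q, seq: qs, to: q + qn - 1 },
      midline: midline.slice(i, i + width),
      subject: { from: h, seq: hs, to: h + hn - 1 }
    });
    q += qn;
    h += hn;
  }
  return blocks;
}
```

With the values of Hsp_qseq, Hsp_midline, Hsp_hseq, Hsp_query-from and Hsp_hit-from and a width of 60, the first block covers query positions 1 to 60 and subject positions 1 to 53, because the subject row contains seven gap characters.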



That's it,

Pierre

18 February 2011

A Data Scraper for Amazonia (expression)


Today, we had a lecture about the "human induced pluripotent stem cells", presented by John De Vos. He introduced Amazonia, a free web atlas that allows an easy query of public human transcriptome data. Although there is no web service (REST/SOAP) to access this data, I was interested in getting some profiles of expression from this database as it is something I've failed to achieve with NCBI/GEO.

I wrote the following java scraper:

  • Line 84: we search for a gene name
  • Line 88: if there is an HTTP redirection, the gene has been found
  • Line 96: the HTML page is downloaded
  • Lines 100-112: fix the HTML to create a valid XML document
  • Line 133: transform the HTML page to a DOM document
  • Lines 135-151: use XPath to find the images and the labels
  • Lines 189-211: put the data into a Java/Swing dialog

Compilation

javac AmazoniaRobot.java

Execution

java AmazoniaRobot EIF4G1
Et voilà:


That's it !

Pierre

06 August 2010

A MediaWiki extension displaying the UCSC Genome Browser

Today I wrote an extension for MediaWiki displaying an HTML <iframe/> to the UCSC Genome Browser. This extension will help my colleagues annotate some candidate genes through our local wiki.

This extension handles a new tag <ucsciframe> composed of three required parameters: 'chrom', 'start' and 'end'.

For example
<ucsciframe chrom="chr2" start="98987" end="9879899"/>
The source code for this extension is available at: and its documentation is available on www.mediawiki.org.
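To give an idea of what such a tag could expand to, here is a sketch in JavaScript (the extension itself is a MediaWiki extension and is not reproduced here). The hgTracks "db" and "position" parameters are the usual way to link to a region of the UCSC Genome Browser; the default assembly "hg19", the iframe size and the function name are assumptions of this example.

```javascript
// Build an <iframe> pointing at the UCSC Genome Browser for a region.
// 'db' defaults to "hg19" here, which is an assumption of this sketch.
function ucscIframe(chrom, start, end, db = "hg19") {
  if (!/^chr[0-9A-Za-z_]+$/.test(chrom)) {
    throw new Error("bad chrom: " + chrom);
  }
  const s = Number(start);
  const e = Number(end);
  if (!Number.isInteger(s) || !Number.isInteger(e) || s < 0 || e < s) {
    throw new Error("bad interval: " + start + "-" + end);
  }
  const url = "https://genome.ucsc.edu/cgi-bin/hgTracks?db=" +
    encodeURIComponent(db) +
    "&position=" + encodeURIComponent(chrom + ":" + s + "-" + e);
  // Escape '&' because the URL is written inside an HTML attribute.
  return '<iframe src="' + url.replace(/&/g, "&amp;") +
    '" width="100%" height="400"></iframe>';
}
```

Validating the three parameters before building the URL matters on a wiki, where the tag attributes are typed by users.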

That's it !

Pierre

24 July 2009

Ajax/PHP/Mysql/Canvas Drawing a circular genome, my notebook.

I've been asked to draw a circular map of the genome. Some tools already exist, for example circos, a Perl program.



Jan Aerts is also writing pARP, a circular genome browser using Ruby and ruby-processing:


My data are stored in a big database and it might take some time before all the data are processed and displayed. So my idea was to call the server with some asynchronous AJAX queries, retrieve the data in chunks, and display each chunk as soon as it is returned by the server.

The code below is a proof of concept. This code is ugly, I wouldn't code things like this for a real piece of software. As a source of data I've used the snp129 and the knownGene tables of the UCSC stored in a mysql database. The server was implemented using PHP.

Client Side

When the document is loaded, the <canvas> element is resized. A first AJAX query is sent to retrieve an array of density of the SNPs on the human chromosome 1. The JSON response is processed, the maximum number of SNPs is found and each item of this array is displayed on the canvas. After that, a second AJAX query is sent to retrieve the density of the genes.
<html xmlns="http://www.w3.org/1999/xhtml"><head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
<script><![CDATA[
/** the canvas element */
var canvas = null;
/** radius of the canvas */
var radius=500;
/** AJAX request */
var httpRequest=null;
/** Graphics context */
var g=null;
/** length of chrom1 */
var CHR1_LENGTH =248000000.0;
/** window length (bp) */
var windowLength=0;
/** first track is snp129 */
var database="snp129";

/** ajax callback */
function paintSnps()
{
if (httpRequest.readyState == 4) {
// everything is good, the response is received
if (httpRequest.status == 200)
{
var jsondata=eval("("+httpRequest.responseText+")");
var counts=jsondata.counts;
//get the maximum of item
var max=0;
for(var i=0;i< counts.length;++i)
{
if(counts[i].count > max) max= counts[i].count*1.0;
}
var r1= radius/2.0;
if(database=="knownGene")
{
r1+= 2+radius/4.0;
}
//loop over the items
for(var i=0;i< counts.length;++i)
{
var a1= Math.PI*2.0*i/(1.0*counts.length);
var a2= Math.PI*2.0*(i+1)/(1.0*counts.length);

var r2= r1+(counts[i].count/max)*(radius/4.0);
//draw the item
g.beginPath();
g.moveTo( radius + Math.cos(a1)*r1, radius + Math.sin(a1)*r1);
g.lineTo( radius + Math.cos(a1)*r2, radius + Math.sin(a1)*r2);
g.lineTo( radius + Math.cos(a2)*r2, radius + Math.sin(a2)*r2);
g.lineTo( radius + Math.cos(a2)*r1, radius + Math.sin(a2)*r1);
g.stroke();
g.fill();
}
//if it was snp, then look for knownGene, change the coors
if(database=="snp129")
{
database="knownGene";
g.fillStyle = "yellow";
g.strokeStyle = "blue";
setTimeout(fetchDB,100);
}
}
else
{
//boum!!
}
}
else {
// still not ready
}

}

/** calls the AJAX request */
function fetchDB()
{
httpRequest= new XMLHttpRequest();
httpRequest.onreadystatechange = paintSnps;
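//with GET, the parameters go in the query string: the argument of send() is ignored
httpRequest.open('GET', 'ucsc.php?length='+windowLength+'&database='+encodeURIComponent(database), true);
httpRequest.send(null);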

}

/** init document */
function init()
{
canvas=document.getElementById("genome");
//resize canvas
canvas.setAttribute("width",2*radius);
canvas.setAttribute("height",2*radius);
if (!canvas.getContext) return;
g = canvas.getContext('2d');
//paint background
var lineargradient = g.createLinearGradient(radius,0,radius,2*radius);
lineargradient.addColorStop(0,'white');
lineargradient.addColorStop(1,'black');
g.fillStyle = lineargradient;
g.fillRect(0,0,2*radius,2*radius);
g.strokeStyle = "black";
g.strokeRect(0,0,2*radius,2*radius);
g.fillStyle = "red";
g.strokeStyle = "green";

var perimeter= 2*Math.PI*(radius/2.0);
windowLength = Math.round(CHR1_LENGTH/perimeter);

//launch the first ajax request
setTimeout(fetchDB,100);
}


]]></script>
</head><body onload="init();">
<canvas id="genome" />
</body></html>
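The geometry used by the page can be isolated from the canvas. The sketch below repeats the same arithmetic as `init()` and `paintSnps()`; the function names `windowLengthFor` and `wedgeCorners` are mine and are not part of the page above.

```javascript
/**
 * Number of bases per window, chosen so that one window covers
 * about one pixel on the inner circle (radius/2).
 */
function windowLengthFor(chromLength, radius) {
  var perimeter = 2 * Math.PI * (radius / 2.0);
  return Math.round(chromLength / perimeter);
}

/**
 * Four corners of the wedge drawn for item i out of n.
 * r1 is the inner radius of the track; the wedge grows outward
 * by at most radius/4, proportionally to count/max.
 */
function wedgeCorners(i, n, count, max, radius, r1) {
  var a1 = 2 * Math.PI * i / n;
  var a2 = 2 * Math.PI * (i + 1) / n;
  var r2 = r1 + (count / max) * (radius / 4.0);
  function point(angle, r) {
    // the circle is centred on (radius, radius)
    return { x: radius + Math.cos(angle) * r, y: radius + Math.sin(angle) * r };
  }
  return [point(a1, r1), point(a1, r2), point(a2, r2), point(a2, r1)];
}

// First wedge of four, with the maximum count, on the SNP track
var corners = wedgeCorners(0, 4, 10, 10, 500, 250);
```

The four points are in the same order as the `moveTo`/`lineTo` calls of the page, so they can be passed directly to a canvas path.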

The server

The (ugly) PHP page is a simple script returning the density of the objects mapped on the chromosome 1 for a given table.
<?php
$con=NULL;

function cleanup()
{
//$con is declared outside of the function
global $con;
if($con!=NULL) mysql_close($con);
flush();
exit;
}

header('Cache-Control: no-cache, must-revalidate');
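header("Content-Disposition: attachment; filename=\"result.json\"");
//a single Content-type: the second header used to override the first one.
//text/plain is kept because the output below is a JavaScript object literal, not strict JSON
header('Content-type: text/plain');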

$con = mysql_connect('localhost', 'anonymous', '');
if (!$con) {
echo "{status:'Error',message:'". mysql_error()."'}";
cleanup();
}
if(!mysql_select_db('hg18', $con))
{
echo "{status:'Error',message:'cannot select db'}";
cleanup();
}
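$database="snp129";
//only accept the two known tables: the name is pasted into the SQL query below
if(isset($_GET["database"]) && in_array($_GET["database"],array("snp129","knownGene"),TRUE))
{
$database=$_GET["database"];
}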


$length=1E6;
if(isset($_GET["length"]))
{
$length= (int)$_GET["length"];
}
if($length<=0) $length=1E6;

$nameStart="chromStart";
if($database=="knownGene")
{
$nameStart="txStart";
}


$sql="SELECT CAST(ROUND(".$nameStart."/".$length.") AS SIGNED INTEGER )*".$length.",count(*) from ".$database." where ".
" chrom=\"chr1\" ".
" group by CAST(ROUND(".$nameStart."/".$length.") AS SIGNED INTEGER )*".$length.
" order by 1"
;

$result = mysql_query($sql ,$con );

if(!$result)
{
echo "{status:'Error',message:'".mysql_error($con) ."'}";
cleanup();
}

$found=FALSE;


echo "{status:'OK',";
echo "length:".$length.",";
echo "counts:[";

while ($row = mysql_fetch_array($result))
{
if($found) echo ",\n";
$found=TRUE;
echo "{chromStart:".$row[0].",count:".$row[1]."}";
}

echo "]}";

cleanup();

?>
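The SQL query assigns each start position to the window `ROUND(start/length)*length` and counts the rows per window. Here is a rough JavaScript sketch of the same arithmetic on an in-memory array; the function name `windowDensity` is hypothetical and the positions in the usage lines are made up for illustration.

```javascript
/**
 * Count the positions falling in each window, like the GROUP BY of the SQL query.
 * For non-negative positions Math.round() behaves like MySQL ROUND().
 * Windows without any position are absent from the result, as with GROUP BY.
 */
function windowDensity(positions, windowLength) {
  var bins = {};
  positions.forEach(function (pos) {
    var start = Math.round(pos / windowLength) * windowLength;
    bins[start] = (bins[start] || 0) + 1;
  });
  return Object.keys(bins)
    .map(Number)
    .sort(function (a, b) { return a - b; })
    .map(function (start) {
      return { chromStart: start, count: bins[start] };
    });
}

// Made-up positions, window of 1000 bases
var density = windowDensity([10, 400, 600, 1400, 2600], 1000);
```

Because empty windows are skipped, the index of an item in the array is not always its rank on the chromosome; a client that needs exact positions could use `chromStart` instead of the array index.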
And here is the kind of JSON document returned by the server:
{status:'OK',
length:1000000,
counts:[
{chromStart:0,count:6191},
{chromStart:1000000,count:8897},
{chromStart:2000000,count:5559},
{chromStart:3000000,count:6671},
{chromStart:4000000,count:6398},
{chromStart:5000000,count:5462},
{chromStart:6000000,count:5678},
{chromStart:7000000,count:4737},
{chromStart:8000000,count:5313},
{chromStart:9000000,count:5148},
{chromStart:10000000,count:4055},
{chromStart:11000000,count:5012},
{chromStart:12000000,count:5363},
{chromStart:13000000,count:10165},

(...)

{chromStart:239000000,count:5502},
{chromStart:240000000,count:6173},
{chromStart:241000000,count:7928},
{chromStart:242000000,count:3800},
{chromStart:243000000,count:5503},
{chromStart:244000000,count:7120},
{chromStart:245000000,count:6148},
{chromStart:246000000,count:6015},
{chromStart:247000000,count:5337}
]
}
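Note that this document uses unquoted keys and single-quoted strings: it is a JavaScript object literal rather than strict JSON, which is why the client reads it with `eval()`. As a sketch only, assuming the server were changed to quote its keys and strings (the PHP above does not), the client could use `JSON.parse` instead. The function name `parseDensity` is hypothetical; the two counts are copied from the output above.

```javascript
/**
 * Parse a strict-JSON response and find the maximum count.
 * Assumes the same fields as the server output: status, length, counts.
 */
function parseDensity(text) {
  var doc = JSON.parse(text); // throws on unquoted keys or single quotes
  if (doc.status !== "OK") {
    throw new Error(doc.message || "server error");
  }
  var max = 0;
  doc.counts.forEach(function (item) {
    if (item.count > max) max = item.count;
  });
  return { counts: doc.counts, max: max };
}

// The first two windows of the output above, rewritten as strict JSON
var strict = '{"status":"OK","length":1000000,"counts":[' +
  '{"chromStart":0,"count":6191},{"chromStart":1000000,"count":8897}]}';
var parsed = parseDensity(strict);
```

Unlike `eval()`, `JSON.parse` never executes the text it receives.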

Result




That's it

PS: Hum, yes, I know, it's not as fast or as beautiful as GenoDive, which was introduced at the Biohackathon.



Pierre

19 July 2009

3D histograms using CSS -moz-transform

Firefox 3.5 includes a new CSS property called -moz-transform. The -moz-transform CSS property lets you modify the coordinate space of the CSS visual formatting model. Using it, elements can be translated, rotated, scaled, and skewed as this text.

I've used this new property to draw a 3D histogram:

(3D histogram rendered with -moz-transform: a 6 x 6 grid of bars indexed from [0,0] to [5,5]. Each bar shows its index on the left pane, its value on the right pane, for example 124 for [0,0], and a percentage on the top pane, for example 99%.)


Here is the code for a simple cube:

<!-- left pane -->
<div style="position:absolute; -moz-transform-origin: 0px 0px; -moz-transform: translate(300px,200px) rotate(90deg) skew(-45deg); background:gray; font-size:36px; color:white; width:300px; height:40px; border:1px solid black;text-align:right;overflow:hidden;">[0,0]</div>

<!-- right pane -->
<div style="position:absolute; -moz-transform-origin: 0px 0px; -moz-transform: translate(340px,160px) rotate(90deg) skew(45deg); background:lightGrey ; font-size:36px; color:white; width:300px; height:40px;border:1px solid black;text-align:right;overflow:hidden;">124</div>

<!-- top pane -->
<div style="position:absolute; -moz-transform-origin:0 0; -moz-transform: translate(300px,120px) skew(-45deg, 45deg); background:dimgray ; font-size:18px; color:white; width:40px; height:40px;border:1px solid black;text-align:center;overflow:hidden;" title="300">99%</div>
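The three transforms of this cube seem to follow a simple pattern: with a pane thickness of 40px, the right pane is shifted by (+40,-40) and the top pane by (0,-80) relative to the left pane. The sketch below generalizes that pattern; it is inferred from this single example only, and the function name `cubeTransforms` and its parameters are hypothetical.

```javascript
/**
 * Build the -moz-transform values of the three panes of a bar.
 * x, y: position of the left pane, depth: thickness of a pane (40 in the example).
 * The offsets are an assumption generalized from the single cube shown above.
 */
function cubeTransforms(x, y, depth) {
  return {
    left: "translate(" + x + "px," + y + "px) rotate(90deg) skew(-45deg)",
    right: "translate(" + (x + depth) + "px," + (y - depth) + "px) rotate(90deg) skew(45deg)",
    top: "translate(" + x + "px," + (y - 2 * depth) + "px) skew(-45deg, 45deg)"
  };
}

// The cube of the example
var panes = cubeTransforms(300, 200, 40);
```

Each string can then be assigned to the `-moz-transform` property of the corresponding `<div>`, the width of the left and right panes giving the height of the bar.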


If you don't have Firefox 3.5, here is a screenshot showing how my browser displays this page (at the top, the same page viewed in Konqueror).


That's it
Pierre