News from Rosetta@home



dhatz
03-06-06, 22:36
A short description of Rosetta@home (http://boinc.bakerlab.org/rosetta)

There are three scientific goals, as the logo itself says:

http://boinc.bakerlab.org/rosetta/rah_images/rosetta_at_home_logo.gif

1/ protein folding, i.e. 3D (tertiary) structure prediction. As is well known, almost all diseases manifest at the level of protein function. Yet most of the proteins in the human body are still unknown. Imagine trying to repair a computer while knowing what only half of its components do. With the completion of the Human Genome Project in 2003 (perhaps the most important scientific achievement of the century), we have the "genetic instructions" for building the ~400,000 proteins in our body, but not their final shape. This is the problem that the protein prediction projects (Rosetta, Predictor, TANPAKU, etc.) are trying to solve. In the meantime, the 3D structures of proteins are being solved slowly (weeks or even months per protein) and expensively ($100,000) in laboratories, using X-ray crystallography or NMR.

2/ protein design (they have designed and built the first "artificial" protein)

3/ protein-ligand docking for virtual screening applications in drug discovery, where, for example, they test billions of chemical substances (small molecules) to find which ones can block a given protein. These protein inhibitors are the most promising cancer therapies of our time.

Over the last two months, Rosetta@home has been the 2nd fastest-growing BOINC project, and right now it has about 30,000 ACTIVE users and 55,000 computers; see

http://www.boincstats.com/
http://www.boincstats.com/stats/project_graph.php?pr=rosetta

About 400-800 new users sign up every day. Together with CPDN, it is the most demanding BOINC project in terms of RAM (the official requirements call for PCs with 512 MB, although the science software itself needs up to 200 MB per CPU) and ADSL bandwidth.

The same information in English, in more detail, with screenshots, explanatory images and many links to related topics, is on Wikipedia:

http://en.wikipedia.org/wiki/Rosetta%40home

dhatz
03-06-06, 23:02
More news from Rosetta@home:

I should note that it is easy to follow their news, because the scientific team (which is literally international; I even spotted the name of a Greek there) gives almost daily feedback on the progress of the research. For example, the head scientist David Baker, 43 years old, keeps a journal at

David Baker's Rosetta@home journal:
http://boinc.bakerlab.org/rosetta/forum_thread.php?id=1177

about what they are working on at any given moment, what kind of workunits are "running" on the PCs of us "CPU time donors", and so on, and he answers related questions in a parallel thread.

DISCUSSION of Rosetta@home Journal
http://boinc.bakerlab.org/rosetta/forum_thread.php?id=1635
http://boinc.bakerlab.org/rosetta/forum_thread.php?id=1178

A collection of useful links for Rosetta@home
http://boinc.bakerlab.org/rosetta/forum_thread.php?id=3

Which WUs ran from Nov-05 to May-06
From Jan to May 2006 the workunits were mainly for 3D structure prediction on KNOWN proteins, in order to improve the algorithms and develop new, better ones.

Which WUs will run from May-06 to Aug-06
Since last month, as part of CASP7, the WUs being run concern UNKNOWN proteins. These unknown proteins are, at the same time, being solved EXPERIMENTALLY in various laboratories around the world (with X-ray crystallography), so that the accuracy of the methods for "MATHEMATICALLY" solving 3D protein structures can be verified.
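To give an idea of what "verifying the accuracy" of a predicted structure against an experimental one can look like, here is a minimal sketch that computes the root-mean-square deviation (RMSD) between two sets of atom coordinates. This is only an illustration: it assumes the two structures are already superimposed, the coordinates are made up, and CASP's actual assessment uses more elaborate measures than plain RMSD.

```python
import math


def rmsd(predicted, experimental):
    """RMSD between two equally long lists of (x, y, z) atom coordinates.

    Assumes both structures are already superimposed in the same frame.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("structures must be non-empty and have the same number of atoms")
    total = 0.0
    for (px, py, pz), (ex, ey, ez) in zip(predicted, experimental):
        total += (px - ex) ** 2 + (py - ey) ** 2 + (pz - ez) ** 2
    return math.sqrt(total / len(predicted))


# Hypothetical coordinates in angstroms, invented for this example only.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
predicted = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]

# Every atom is off by exactly 1 angstrom, so the RMSD is 1.0.
print(rmsd(predicted, experimental))
```

The lower the value, the closer the computed model is to the structure solved in the laboratory.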

The CASP experiment (http://predictioncenter.org/) is an open international "experiment" in which, every two years, scientific teams from all over the world working on folding / protein prediction "compete" to see how successful the techniques they have developed are.

The targets of this year's CASP7 are at
http://predictioncenter.gc.ucdavis.edu/casp7/targets/cgi/casp7-view.cgi?loc=predictioncenter.org;page=casp7/

dhatz
05-06-06, 00:46
The Rosetta software (which has been under development for the last 9 years) is used by many research centers around the world for a variety of applications.

The best known is protein structure prediction, also called folding, e.g. in the Human Proteome Folding project (http://en.wikipedia.org/wiki/Human_Proteome_Folding_Project - which is run by the IBM/WCG grid and grid.org, together with FightAIDS@home), which aims to determine the 3D structures of the unknown proteins in the human body. On the same topic there is the biennial CASP experiment that I mentioned in the previous post, which will run until early Aug-2006.

The second important application of Rosetta is the design of artificial proteins (protein design), e.g. for use in gene therapies (for instance, to fight cancer):


Cancer: Cancer can be caused by mutations in key genes that disrupt normal cellular control processes. We are developing methods for cutting DNA at specific sites in the genome, and we will be targeting sites that are implicated in cancer. After these sites are cut, they should be repaired by the cell using a second, unmutated copy of the gene and the cell should no longer be cancerous. This is a very specific form of gene therapy that, if successful, will circumvent one of the main objections to current gene therapy methods; namely, current methods insert the unmutated copy of a gene randomly into the genome, and if the insertion point happens to be near an oncogene, the gene therapy will cure one disease but cause another. Because our methods will target specific sites instead of random sites, they should avoid this pitfall.
Source: http://boinc.bakerlab.org/rosetta/rah_medical_relevance.php

This week Nature is publishing more on this topic: the design, with Rosetta, of a new (this time artificially constructed) protein that will "cut and stitch" and repair altered or mutated DNA:


Protein engineering: OK Computer (pp 656-659)

One of the great remaining problems in computational protein design involves the redesign of a DNA-modifying protein so that it recognizes, and alters, a new DNA sequence. For example, changing the specificity of a nuclease, a protein that cuts DNA at a specific site, could be beneficial for a range of biotechnological and medical applications.
In this week's Nature, David Baker and colleagues have shown that it is possible to modify the sequence specificity of a "homing endonuclease" called I-MsoI. They used a computational approach to screen a virtual library of mutant proteins and predicted which amino acids needed to be changed to re-engineer this enzyme so that it recognized, and cleaved, a new DNA sequence. The mutant protein was highly active and was able to cleave the new DNA sequence, but did not modify the original sequence. The authors hope to redesign this and other DNA-modifying enzymes to alter a range of DNA sequences, so that they could specifically target almost any sequence in the genome. These computationally designed proteins may be useful in a range of medical and biotechnological applications, including gene therapeutic and other targeted genomics applications.

dhatz
15-06-06, 00:02
As I mentioned in a previous message, one of the main goals of Rosetta@home (as with the other related projects: Predictor@home, TANPAKU and, earlier, Distributed Folding) is protein structure prediction.

This means determining "COMPUTATIONALLY" (on a computer) the "final" 3D shape of every protein in the human body and in other organisms. Today this is still done EXPERIMENTALLY (in laboratories, with X-ray crystallography or NMR) at great cost in time (weeks or even months) and money.

Knowledge of proteins is extremely useful, because almost all diseases manifest at the level of protein activity.

I am copying a description of the "experimental" procedure (in the laboratory, with X-ray crystallography) that someone posted today on the Rosetta@home forum:

http://boinc.bakerlab.org/rosetta/forum_thread.php?id=1801

Just to show how difficult it is to do an X-ray analysis of a protein:

A colleague of mine was investigating the structure of a protein called "violet-colored acid phosphatase". To do so she first had to process hundreds of kilos of sweet potatoes. Then she had to pre-separate the different ingredients by methods I do not really know any more, something like precipitating things with special chemicals. Then the protein containing rest was subjected to gel-electrophoresis and tiny amounts of the protein were isolated. This had to be done with numerous samples until in the end they got 500 mg of the protein.

Next step was crystallization, which again took many samples and a lot of time. Finally they had some single crystals they could use for structure determination, but it turned out the protein tended to disintegrate under X-rays. They finally got time at the DESY where they could do measurements with synchrotron radiation which worked out (much higher intensity, thus much shorter measuring time).

The step of solving the structure is the next ordeal: carbon, nitrogen and oxygen have almost identical electron densities and will thus show up in the electron density map the x-ray yields as hardly distinguishable "bulges". Hydrogen, very important to distinguish those atoms, is hardest to find as it has only one electron and is thus just a speck in a sea of possible bulges and will only be seen when the electron densities of heavier atoms are assigned correctly. So you just work your way forward, guessing the backbone together, assigning CH, CH2, CH3, NH, NH2, and OH groups where you think the heavier atoms carry hydrogen and slowly go towards your aim of a totally solved structure.

What you also have to account for is water, which also shows up the way the backbone and residue atoms do, and often adjacent water molecules emulate structures of the backbone or residues, so you have to separate this from the protein itself (when the assignment of atom species is only a rough one, the atoms seem "smeared", so you cannot exactly determine their distances, which complicates the separation of water and protein a lot).

So it is a long and troublesome thing to do (at least it was ten years ago, maybe things have improved a little with faster PCs and better search algorithms). It mainly depends on your experience, how well you proceed.

The whole thing took three years and three people were working on it in different fields and with different samples (one tried to extract the enzyme from uteri of pregnant pigs). Some attempts were simply fruitless and you need a high frustration threshold to work in that field.

After going through all that labour, trouble and waiting comes the next problem: you have treated the enzyme so brutally and exposed it to so many chemical and physical procedures. Who is now going to guarantee you still have a native enzyme in your crystal? Maybe all those changes in pH and chemical composition of the solution have denatured your target.

So, to me, the bottom line is: folding is not only much less troublesome, cheaper and faster, it is also much likelier to yield the correct structure. And that is what you finally want.

dhatz
28-11-06, 14:23
"Docking" can be either:

protein-ligand docking (ligand = small chemical molecule; it is small so that it is water-soluble and can circulate inside the human body. Most drugs belong to this category)

or protein-protein docking

Wikipedia also has a few pages on this, but they are "stubs"

http://en.wikipedia.org/wiki/Molecular_docking
http://en.wikipedia.org/wiki/Protein-ligand_docking
http://en.wikipedia.org/wiki/Protein-protein_docking

Note: there are various "docking" simulation software packages, with different capabilities (AutoDock, THINK, Rosetta, etc.)

Rosetta@home has started research on protein-protein docking:

Protein-protein docking is a computational task which aims at predicting the structure of a protein complex given that the structures of the individual protein partners are solved. In update 5.32, we made the Rosetta protein docking protocol compatible with Rosetta@Home so that we can take advantage of the computational power brought by the BOINC distributed computing technology and, of course, the generous contribution from users all over the world. Thanks to everyone for the help!

My name is Chu Wang and I am a graduate student in Dr. Baker's lab working on developing new methodology to better understand the protein docking problem. Below is some background information about this project.

Docking (in biology) refers to the computational technique aiming at predicting the interaction between two or even more biological molecules. Such interactions could be between proteins, proteins and DNA (or RNA), or proteins and small chemical compounds (ligands). So docking can be classified as protein-protein docking, protein-DNA docking and protein-ligand docking.

Protein interactions are very important in biology because proteins do not act alone and they have to "talk" to each other in order to accomplish any biological process. Solving the structure of a protein complex can provide a mechanistic basis for understanding how a biological signal is transduced or how a biological function is carried out. It is much more time-consuming and technically challenging to solve the structure of a protein complex experimentally, and thus developing computational methods to approach this problem becomes of great interest for many groups.

Besides predicting the structure of a protein from its sequence, Rosetta has also been developed to handle various types of docking problems as mentioned above, including protein-protein docking. In our standard protein-protein docking protocol, we start with two protein structures in space and first carry out a very fast but crude search to find a rough shape fit between these two proteins. During the first stage, the proteins are represented by only backbones (which define the shape) and one pseudo atom for sidechains (that is why it is fast). Afterwards, sidechain atoms are added back and the docking protocol enters the full-atom refinement stage in which the relative orientation between the two proteins and the detailed sidechain interactions across the interface are optimized simultaneously. Each trajectory will end up with a model with a certain docking orientation and we also have an energy function to rank them.
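The two-stage idea described above (a fast, crude search followed by refinement, with the resulting models ranked by energy) can be sketched on a toy problem. This is not Rosetta code: the 2D "pose", both energy functions and all parameter values are invented for illustration, and the refinement here simply keeps any move that lowers the energy.

```python
import math
import random


def coarse_energy(pose):
    """Made-up low-resolution energy: a smooth well centred at (3, -2)."""
    x, y = pose
    return (x - 3.0) ** 2 + (y + 2.0) ** 2


def full_energy(pose):
    """Made-up detailed energy: the same well plus a small ripple."""
    x, y = pose
    return coarse_energy(pose) + 0.3 * math.sin(5 * x) * math.sin(5 * y)


def run_trajectory(rng, coarse_samples=200, refine_steps=500, step=0.05):
    # Stage 1: fast, crude search over random poses with the cheap energy.
    candidates = [(rng.uniform(-10, 10), rng.uniform(-10, 10))
                  for _ in range(coarse_samples)]
    pose = min(candidates, key=coarse_energy)
    start_energy = full_energy(pose)

    # Stage 2: refinement with small random moves on the detailed energy,
    # keeping a move only if it lowers the energy (a simplifying assumption).
    energy = start_energy
    for _ in range(refine_steps):
        trial = (pose[0] + rng.gauss(0, step), pose[1] + rng.gauss(0, step))
        trial_energy = full_energy(trial)
        if trial_energy < energy:
            pose, energy = trial, trial_energy
    return pose, start_energy, energy


rng = random.Random(0)  # fixed seed so the run is repeatable
models = [run_trajectory(rng) for _ in range(10)]  # one model per trajectory
ranked = sorted(models, key=lambda m: m[2])        # rank models by final energy
best_pose, _, best_energy = ranked[0]
print(best_pose, best_energy)
```

Each trajectory is independent of the others, which is what makes this kind of search a natural fit for distributing across many volunteer PCs.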

The complexity of a docking problem can vary a lot. Proteins are flexible and dynamic biological molecules, which means that their 3-D structures may change under different conditions. Such flexibility can be observed at both the backbone and the sidechain level. So it is very possible that the protein structures in the isolated (unbound) form we start with may look different from those in the final complex (bound) form.
1. If no internal freedoms are considered for each protein, it is more like docking two "rocks" together and it is called "rigid-body" docking. Only six parameters are variable: the translations and rotations that describe the relative orientation of the two proteins.
2. As mentioned above, the current standard Rosetta docking method takes the sidechain flexibility into consideration though the protein backbones are still being fixed. We may call this approach "semi rigid-body" docking.
3. The next level of docking problem is "flexible-backbone" docking, which is to allow protein backbones to vary as well. This is very challenging, as in addition to sampling the rigid-body orientation, we will also have to take care of the "folding" problem of TWO proteins.
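The six rigid-body parameters from item 1 (three translations and three rotations) can be illustrated with a short sketch that moves one molecule as a whole while leaving its internal geometry untouched. The choice of rotation convention (rotations about the x, y and z axes, applied in that order) and the example coordinates are assumptions made for this illustration, not something taken from Rosetta.

```python
import math


def rotation_matrix(alpha, beta, gamma):
    """Rotation R = Rz(gamma) * Ry(beta) * Rx(alpha), angles in radians."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [cg * cb, cg * sb * sa - sg * ca, cg * sb * ca + sg * sa],
        [sg * cb, sg * sb * sa + cg * ca, sg * sb * ca - cg * sa],
        [-sb, cb * sa, cb * ca],
    ]


def rigid_body_move(atoms, pose):
    """Apply the six parameters (tx, ty, tz, alpha, beta, gamma) to every atom."""
    tx, ty, tz, alpha, beta, gamma = pose
    r = rotation_matrix(alpha, beta, gamma)
    moved = []
    for x, y, z in atoms:
        moved.append((
            r[0][0] * x + r[0][1] * y + r[0][2] * z + tx,
            r[1][0] * x + r[1][1] * y + r[1][2] * z + ty,
            r[2][0] * x + r[2][1] * y + r[2][2] * z + tz,
        ))
    return moved


# Hypothetical three-atom "protein", coordinates invented for the example.
ligand = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# Shift by 10 along x and rotate by 90 degrees about the z axis.
pose = (10.0, 0.0, 0.0, 0.0, 0.0, math.pi / 2)
moved = rigid_body_move(ligand, pose)

# Distances between atoms are unchanged: the molecule moved like a "rock".
print(math.dist(ligand[0], ligand[1]), math.dist(moved[0], moved[1]))
```

Sidechain or backbone flexibility (items 2 and 3) would add many more variables on top of these six, which is why those levels are so much harder to sample.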

Similar to the CASP experiment for structure prediction, there is a blind docking prediction experiment, CAPRI, in which two protein structures are provided and participants are asked to predict the structure of the complex. Using Rosetta, the Baker Lab team has submitted high-quality predictions for several targets for which backbones do not vary very much between the unbound form and bound form, and some of these predictions are even accurate at atomic resolution (both backbone and sidechain are correct). This has shown the strength of our protocol in allowing sidechains to be optimized.

However, an important lesson learned from the CAPRI experiments is that the current bottleneck for developing protein docking methods is how to treat backbone flexibility, as there was almost unanimous failure for the targets with backbone movements upon forming the complex. Currently, I am working to develop new approaches to consider backbone flexibility in our Rosetta docking method and I believe the computational power provided by BOINC and the millions and millions of people who volunteer to donate their computer resources is a key factor to the success of this challenging project.
http://boinc.bakerlab.org/rosetta/forum_thread.php?id=2395
