Bioinformatics and Computational Biology - Science topic
Questions related to Bioinformatics and Computational Biology
I'm using AutoDock Vina in Python to dock multiple proteins and ligands, but I'm having trouble setting the docking parameters for each protein. How can I do this in Python? (I have attached my Python code, in which I assumed the same parameters for all proteins.)
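One way to give every protein its own search box is to keep the parameters in a dictionary keyed by receptor file and build one Vina command per receptor-ligand pair. This is a minimal sketch, assuming the `vina` command-line program is installed and the inputs are already in PDBQT format; the file names, box centres and box sizes are hypothetical placeholders, not values from the attached script.

```python
from itertools import product

# Hypothetical per-receptor search boxes (centre and size in Angstrom).
# Replace the file names and numbers with your own values.
BOXES = {
    "proteinA.pdbqt": {"center": (10.0, 12.5, -3.0), "size": (20, 20, 20)},
    "proteinB.pdbqt": {"center": (-4.2, 8.0, 15.1), "size": (24, 22, 26)},
}
LIGANDS = ["lig1.pdbqt", "lig2.pdbqt"]


def build_vina_commands(boxes, ligands, exhaustiveness=8):
    """Return one vina argument list per receptor-ligand pair."""
    commands = []
    for (receptor, box), ligand in product(boxes.items(), ligands):
        cx, cy, cz = box["center"]
        sx, sy, sz = box["size"]
        # Output name combines receptor and ligand stems.
        out = f"{receptor.rsplit('.', 1)[0]}__{ligand.rsplit('.', 1)[0]}_out.pdbqt"
        commands.append([
            "vina",
            "--receptor", receptor,
            "--ligand", ligand,
            "--center_x", str(cx), "--center_y", str(cy), "--center_z", str(cz),
            "--size_x", str(sx), "--size_y", str(sy), "--size_z", str(sz),
            "--exhaustiveness", str(exhaustiveness),
            "--out", out,
        ])
    return commands


for command in build_vina_commands(BOXES, LIGANDS):
    print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` to launch the docking; printing the commands first makes it easy to check that every protein gets its own box.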
I'm on the lookout for remote bioinformatics and computational biology opportunities where I can actively contribute to research projects. Compensation is not a priority for me; my main focus is to gain hands-on experience in these fields.
#biopython
#computational_biology
#bioinformatics
#biology
#R
I know many websites have simple tools like transcription and translation available, but are there any analysis tools that researchers need that either do not exist or are not publicly available? It could be anything from algorithms to visuals. Thanks!
How can I dock more than one protein with more than one ligand? I know that PyRx docks one protein with multiple ligands, but how can I do it for multiple proteins with multiple ligands?
Is there any server or tool (Bioconda, Java, etc.) that annotates membrane proteins exclusively (similar to dbCAN for polysaccharides) from a bacterial genome?
Thank you in advance!
I am attempting to use the Seurat FindAllMarkers function to validate markers for rice taken from the plantsSCRNA-db. I want to use the ROC test in order to get a good idea of how effective any of the markers are. While doing a bit of research, different stats forums say: "If we must label certain scores as good or bad, we can reference the following rule of thumb from Hosmer and Lemeshow in Applied Logistic Regression (p. 177):
0.5 = No discrimination
0.5-0.7 = Poor discrimination
0.7-0.8 = Acceptable discrimination
0.8-0.9 = Excellent discrimination
>0.9 = Outstanding discrimination"
For more background, the output of the function returns a dataframe with a row for each gene, showing myAUC: area under the Receiver Operating Characteristic, and Power: the absolute value of myAUC - 0.5 multiplied by 2. Some other statistics are included as well such as average log2FC and the percent of cells expressing the gene in one cluster vs all other clusters.
With this being said, I would assume a myAUC score of 0.7 or above would imply the marker is effective. However, given the formula used to calculate power, a myAUC score of 0.7 would correspond to a power of 0.4. Would it therefore be fair to assume that myAUC should be ignored for the purposes of validating markers? Or should both values be taken into account somehow?
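The relationship between the two columns can be checked numerically. The sketch below only applies the formula quoted in the question (power = 2 × |myAUC − 0.5|); it does not call Seurat, and the function name is made up for illustration.

```python
def power_from_auc(auc):
    """Power as described in the question: twice the distance of the AUC from 0.5."""
    return 2 * abs(auc - 0.5)


# AUC values taken from the rule-of-thumb boundaries, plus one value below 0.5.
for auc in (0.5, 0.7, 0.8, 0.9, 0.3):
    print(f"myAUC = {auc:.1f} -> power = {power_from_auc(auc):.1f}")
```

Under that formula, power is a rescaling of myAUC rather than independent evidence: the AUC cut-offs 0.7, 0.8 and 0.9 map to powers of 0.4, 0.6 and 0.8. The only difference is that power is symmetric, so a gene with myAUC of 0.3 (depleted in the cluster) gets the same power as one with 0.7.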
I want to do a simulation with Molecular Dynamics but lack the facilities.
RNA docking with AutoDock requires a different approach. Which steps are required to compute the Gasteiger charges in particular?
There are so many software packages for docking, but which one is best? Which one should we rely on?
These datasets will be used for data classification and predicting new information
I currently use .csv files to work with pandas dataframes and perform UMAP analyses and I would like to use Scanpy moving forward. Can anyone help me with converting .csv files into Anndata files for Scanpy?
Hello!!!
I would like you to help me with information about full predoctoral or doctoral fellowships in areas such as bioinformatics, computational biology, microbiology or related fields to which I can apply.
I would be very grateful if you could recommend some of them.
Fausto Cabezas-Mera
Greetings from Ecuador
I created this R package to allow easy VCF files visual analysis, investigate mutation rates per chromosome, gene, and much more: https://github.com/cccnrc/plot-VCF
The package is divided into 3 main sections, based on analysis target:
- variant Manhattan-style plots: visualize all/specific variants in your VCF file. You can plot subgroups based on position, sample, gene and/or exon
- chromosome summary plots: visualize plot of variants distribution across (selectable) chromosomes in your VCF file
- gene summary plots: visualize plot of variants distribution across (selectable) genes in your VCF file
Take a look at how many different things you can achieve in just one line of code!
It is extremely easy to install and use, well documented on the GitHub page: https://github.com/cccnrc/plot-VCF
I'd love to have your opinion, bugs you might find etc.
I want to parameterise the Zn metal ion, which is coordinated by CCCH residues (three CYS and one HIS). I followed the MCPB tutorial. During side chain modelling I got errors and was unable to fix the problem. I have attached my PDB file, the sidechain.bcl file and the sidechain.bcl log files.
I uploaded a genome to check with BUSCO via the Galaxy server. It has now been 2 days and the result is still not finished.
Did I miss something, or is there a problem?
Thank you in advance
I have a PDB file of a branched polymeric chain, I want to simulate its water solvation using "openMM"
My problem is to find an amber force field to fit that branched chain.
By the way, I have the force field file (XML) which fits the linear polymeric chain (attached)
Good day! The question is really complex, since CRISPR arrays do not have any exact sequence. So the question is: what is the probability that a random sequence contains 2 repeat units, each of 23-55 bp, with a short palindromic sequence within and a maximum mismatch of 20%, interspersed with a spacer sequence that is 0.6-2.5 times the repeat size and that does not match the left and right flanks of the whole sequence?
I want to dig into machine learning for drug discovery. Can anyone suggest some good reads on where and how to start, which prerequisites need to be checked, and whether there is any publicly available material online?
hello
Please recommend companies that provide biotechnology services such as designing different types of primers, NGS, RNA-Seq, etc.
The workstation I have been using takes nearly 15-20 days for an SPC model simulation. Also, I could not run on the HPCs, as my simulation generates a huge amount of data (which takes up a lot of memory). So I am planning to buy a system to run simulations at home. Which specifications would I need?
PS: I already have one PC with 4 GB RAM and an Intel Core 2 Duo. Can I add an external workstation motherboard?
Certain software packages and sites can calculate the Tm of a DNA hairpin depending on the size of the loop and the stem sequence, for example Gene Runner. Yet the calculation method or citation is not provided. Is there a formula that could help?
Hello everyone,
I have designed this guide on bioinformatics for the Univ of Florida and would greatly appreciate your suggestions/comments on what I am missing and should include. Thanks, Rolando
I have a protein (PDB: 1ZK4) that contains NADP as a ligand in its structure. When I tried to dock it, I got an error while creating the PDBQT file for this protein in AutoDock (Error: "Non-integral charge on residues") and a second error during the grid run (Error: "Found an H-Bonding atom with three bonded atoms, atom serial 1903"). When I removed NADP from my receptor protein, I got proper docking results. Can anyone explain why?
Dear all,
Enclosed here is the graph of the PMF of a small drug molecule in two different lipid systems, obtained with GROMACS steered MD and umbrella sampling methods.
The PMF for both systems are showing a negative energy value. PMF comparison indicating a significant difference at the hydropholic region of for my drug transport.
Is it right to say that the drug transport is more spontaneous (black graph/System-I) as compared to the green graph/System-II?
I hope to receive your comments and suggestions for a fruitful interpretation.
Thanking you
Sincerely
Bikash
I'm trying to find the GC methylation percentage of a specific gene promoter in mouse ESCs. I found http://imethyl.iwate-megabank.org and http://www.methdb.net as available databases; however, after entering the ID of my gene of interest, I was unable to interpret the peaks shown on those websites. Does anybody know how I can interpret and present that information? Or is there any other way to find the methylation status of a gene? Thanks,
During the addition of Na and Cl ions to the solvated system, the program threw the error "no line with molecule 'SOL' found in the [molecules] section of file 'topol.top'".
However, the file topol.top does have this entry in it. Please suggest how to rectify the error.
Thanks in advance.
Regards,
Vinay
During the topology generation of protein (PDB id: 6lu7), a fatal error occurred. How to resolve this error and what precautions should be taken to avoid such errors?
Thanks in advance.
Regards,
Vinay
I have performed a simulation with Desmond and now want to perform MM/PBSA. I don't think Desmond has any functionality for calculating MM/PBSA. How should I move forward? Your guidance will be highly appreciated.
Thanks
After finishing the simulation of the cyclic peptide, I tried to find the most populated structure using the density peaks clustering algorithm. In the literature, the representative structure was chosen as the structure with maximal ρ_sum (the summation of local densities of all residues in one structure, ρ_sum = Σ_{i=1}^{n_res} ρ_i). How can I extract the structure that has the highest summed density over all residues?
ref: Clustering by Fast Search and Find of Density Peaks. Science 2014, 344, 1492–1496
Hi,
GO and KEGG functional analysis for a gene set was performed using the DAVID database (https://david.ncifcrf.gov/). However, the adjusted p-values (Bonferroni and Benjamini) of the enriched GO terms and KEGG pathways were more than 0.5. Meanwhile, a PPI network was constructed using the STRING database (https://string-db.org), with a confidence score of 0.4 as the cutoff criterion and no more than ten as the maximum number of interactions in the first shell. This step added a few more genes to the gene list, and genes with no interactions were removed. When the updated gene list was used for GO and KEGG functional analysis, the enriched GO terms and KEGG pathways were now significant (p-value < 0.05). Is the attempted workflow valid?
Given a BAM file, how can I calculate, at each reference position, the ratio of the number of mismatches between reference and reads to the number of all mapped bases? Comments on any program or script, or any other suggestions, are welcome.
Is it possible to use Artificial Intelligence (AI) in Biological and Medical Sciences to search databases for potential candidate drugs/genes to solve global problems without first performing animal studies?
I do not have much experience in bioinformatics and I need to find the common genes in several gene expression datasets; in other words, I need to find genes that match in all (or some) of my datasets. I am looking for some kind of tool that gives me Venn diagrams with the coincident genes. Any suggestion (free software please) will be much appreciated.
I'm working on using PSO for local alignment of DNA sequences, but I couldn't find a way to represent the alignment or the gaps in the alignment.
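Before drawing a Venn diagram, the overlap itself can be computed with plain set operations. This is a small sketch, assuming each dataset has already been reduced to a list of gene symbols; the dataset names and genes below are invented examples.

```python
from collections import Counter


def shared_genes(datasets, min_datasets=None):
    """Return genes present in at least `min_datasets` datasets (default: all).

    datasets: dict mapping a dataset name to an iterable of gene symbols.
    """
    gene_sets = {name: {g.strip() for g in genes} for name, genes in datasets.items()}
    if min_datasets is None:
        min_datasets = len(gene_sets)
    # Count in how many datasets each gene occurs (sets remove within-dataset repeats).
    counts = Counter(gene for genes in gene_sets.values() for gene in genes)
    return sorted(gene for gene, n in counts.items() if n >= min_datasets)


# Invented example data.
example = {
    "dataset_A": ["TP53", "MYC", "EGFR"],
    "dataset_B": ["MYC", "EGFR", "KRAS"],
    "dataset_C": ["EGFR", "KRAS", "BRCA1"],
}
print(shared_genes(example))                  # genes found in every dataset
print(shared_genes(example, min_datasets=2))  # genes found in at least two
```

The `min_datasets` argument covers the "all (or some)" part of the question; the resulting lists can then be passed to any Venn diagram tool.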
Any opinion will be useful. Thanks
I would like to perform some large-scale virtual screening with PyRx, docking libraries of compounds to the active site of a protein. To do that, the software needs an individual .sdf (or .pdb, or any other coordinate format) file for each small molecule that I'd like to try. In the libraries available online, all the molecules are usually listed in a single file containing thousands of molecules. Do you know if there is a fast way to extract the individual .sdf files from this kind of file? Is there a tool to obtain all the single-molecule .sdf files from the parent one?
Thank you!
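In a multi-molecule SDF file each record ends with a line containing only `$$$$`, so the file can be split with a few lines of plain Python. The sketch below relies only on that delimiter and on the first line of each record being the molecule title; the output naming scheme is an arbitrary choice for illustration.

```python
from pathlib import Path


def split_sdf(sdf_path, out_dir):
    """Write every record of a multi-molecule SDF file to its own .sdf file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    record = []
    with open(sdf_path) as handle:
        for line in handle:
            record.append(line)
            if line.strip() == "$$$$":  # end-of-record marker
                index = len(written) + 1
                # First line of a record is the title; fall back to a counter.
                title = record[0].strip() or f"molecule_{index}"
                safe = "".join(c if c.isalnum() or c in "-_" else "_" for c in title)
                target = out_dir / f"{index:06d}_{safe}.sdf"
                target.write_text("".join(record))
                written.append(target)
                record = []
    return written


# Example call (paths are placeholders):
# files = split_sdf("library.sdf", "single_molecules")
```

Reading line by line keeps memory use low even for libraries with many thousands of molecules.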
Dear RG members,
I am trying to install AMBER in parallel in one cluster having ifort compiler. I am getting the error MPIF90 command not found. I read the configure2 file in AmberTools/src, that tells I need to install serial first.
For the serial build:
setenv AMBERHOME "amber path"
./configure -noX11 intel
make install
It is running perfectly.
How can I proceed to add MPI run.
Kindly suggest the next commands.
Should I hit
./configure -mpi intel
make install
or some other tricks.
I am failing each time with few errors.
Kindly share the complete commands to run after the serial installation steps.
I would like to study the apo (lipid-free) form of a protein that has only been crystallized with lipids. I want to explore whether molecular dynamics can generate a reasonable structure by removing the lipids in several steps until the apo form is obtained. Likewise, I don't know whether it is possible to make lipids disappear during a molecular dynamics trajectory. I am thinking of using programs like GROMACS, AMBER, etc.
Is it possible to do 3D-QSAR without using commercial software? If so, can anybody develop a workflow for doing 3D-QSAR with the suitable free software in each step?
I have used P2Rank in the PrankWeb software and the CASTp tool to analyze the refined structures of some proteins to predict protein pockets and cavities. But now I am not finding any clue to visualize them in PyMOL.
I am currently trying to find homologues of a protein I am working with, but BLAST has been giving me nothing useable. I have now found a dataset of 1500 protein sequences of potential candidates that I want to align to my reference sequence. I have tried Clustal, Mega, Muscle, MAFFT and pretty much everything under the sun, but with this many sequences and only limited experience, I am having trouble achieving what I want to do, as the programs simply crash or lock up after a few minutes..
Instead of the traditional multiple sequence alignment, where every sequence gets aligned to every other sequence with multiple iterations, I want all of the sequences from the dataset to only be aligned to my one reference sequence. Think of it as doing 1500 pairwise alignments only. What would be the best way to perform this kind of alignment?
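Aligning every candidate only to the reference is a loop over independent pairwise alignments, which scales linearly with the number of sequences. The sketch below is a deliberately simple pure-Python Needleman-Wunsch score with match/mismatch/gap values chosen for illustration; it is not what Clustal or MAFFT do internally, and real protein work would use a substitution matrix such as BLOSUM62.

```python
def global_alignment_score(ref, seq, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score with a linear gap penalty."""
    previous = [j * gap for j in range(len(seq) + 1)]
    for i in range(1, len(ref) + 1):
        current = [i * gap]
        for j in range(1, len(seq) + 1):
            diagonal = previous[j - 1] + (match if ref[i - 1] == seq[j - 1] else mismatch)
            current.append(max(diagonal, previous[j] + gap, current[j - 1] + gap))
        previous = current
    return previous[-1]


def rank_against_reference(reference, candidates):
    """Score every candidate against the reference only, best first."""
    scores = {name: global_alignment_score(reference, seq) for name, seq in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Invented toy sequences.
reference = "MKTAYIAK"
candidates = {"same": "MKTAYIAK", "one_sub": "MKTAYLAK", "short": "MKT"}
for name, score in rank_against_reference(reference, candidates):
    print(name, score)
```

For 1500 full-length proteins, a compiled implementation such as Biopython's `Bio.Align.PairwiseAligner` would be a faster way to run the same reference-versus-each loop.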
After installing the CASTp plugin for PyMOL, I tried to use it to view a protein and its pockets from the CASTp server through its job ID. But this error message always occurs
Applications of bioinformatics in medicine is a key factor in technological advancement in the field of modern medical technologies.
In which areas of medical technology are the technological achievements of bioinformatics used?
What are the applications of bioinformatics in medicine?
Please reply
I invite you to the discussion
Thank you very much
Best wishes
My final-year project is on drug efflux pumps and persistence in methicillin-resistant Staphylococcus aureus, and we are going to focus on persister cells to study the pathway of antimicrobial resistance. My question is: how can I link bioinformatics and some coding to this project without requiring WGS, because it is not an option in our lab? I need a small yet beneficial technique or tool that I can learn and implement by myself. PS: I love programming in general, but I'm still new to bioinformatics, so I need help to link my passion for coding with my field, biotechnology.
I have used P2Rank and the CASTp tool to analyze the refined structures of some proteins to predict protein pockets and cavities. But now I am not finding any way of visualizing them in PyMOL.
How I can identify differentially expressed genes from a particular gene family using the GEOdatasets? Which R package is best for differential gene expression analysis?
Is there any open-source software I can use to generate the images needed for NEB calculations? I will be using NEB as implemented by DMol3. Thanks in advance!
I have more than 10 sequences and I want to search for homologs of each sequence. I would like to use PSI-BLAST to retrieve this information from the database, but I would like to do it in one go. I don't want to retrieve this information one sequence at a time.
Could anyone suggest a web-based software or online tool to do a batch sequence BLAST search?
Hello! I'm trying to use ModelX for my final-year thesis research. All requirements for this script are satisfied and I'm using a MySQL server, but when I run the command I get the following error.
Your dnaX time has expired on 2021-Jan-31
Academic licenses: just download it again
Commercial licenses: contact us
I have the latest version of ModelX and everything recommended by the developers. Does anyone have a solution to this, please?
I have some files in bed and bedgraph format to analyze with IGV. My team and I tried to load them into IGV following the tutorials on the IGV site, but it hasn't worked. The bedgraph files are large (5157) and we converted them to the binary .tdf format using the IGVTools "Count" command, but that hasn't worked either. Only with some files can we see a single flat line on the IGV screen, without any information. With FilexT we can see that the bed and bedgraph files are not damaged.
We think that the problem is the step where we select the option "Load from File" in IGV. What can we do?
We use the IGV_2.10.3
I am using DAVID (https://david.ncifcrf.gov/home.jsp) to cluster some genes I found upregulated in my RNA-seq data. I am just using the official gene symbols without any quantitative data. However, the KEGG pathway results are giving me p-values which are extremely high. It does not make any sense to me. How can the p-value be calculated without any numbers? Can the p-value be significant?
When observing elastic modes of proteins, one of the results files shows deformation energy plot. What is the significance of deformation energy when studying protein dynamics?
I am looking for journals that will publish newly developed tool/server/web application/pipeline that are useful in biology, or a newly curated database with biological significance.
Can anyone kindly suggest some journals that publishes Bioinformatics and Computational Biology papers that will publish -
- Bioinformatics Tools/Servers (Machine Learning, Deep Learning based or else)
- Text Mining
- Databases
- Datasets
- Pipeline etc.
I know a few such as:
- Bioinformatics
- Nucleic Acids Research
- Database
- GigaScience
- Nature Scientific Data
- Nature Computational Science
- Briefings in Bioinformatics
- BMC Bioinformatics
- PLOS Computational Biology
- Journal of Cheminformatics
If you know more, kindly suggest the journal names. Thank you in advance.
I have been working on a protein-ligand complex simulation. While I have been careful all the way in preparing the necessary files including the .top and the .gro files I have come across an error stating "2 particles communicated to PME rank 4 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x" while running the mdrun. Initial lookout into this issue gave indications of the system getting blown up. I initially tried to troubleshoot the issue by lessening the time steps as suggested in the gromacs documentation but couldn't resolve the issue. Could anybody give suggestions regarding this issue?
Thanks
How can I determine which bacterial virulence factors (bacterial toxins or cell wall components) relevant to human sepsis or bacterial infection will interact with or regulate my target protein of interest? I have tested LPS treatment in a dose- and time-dependent fashion, but I did not notice any difference in expression. Is there any commercially available panel of bacterial virulence factors, or is a bioinformatic approach possible?
Hi,
I am studying a simultaneous proton transfer, bond breakage and nucleophilic attack (by water molecule), using US approach for which I had already performed 5ns QM/MM simulation.
All three reactions take place in a single step (inversion mechanism for glycoside hydrolase). Now I am confused about how to define the restraint variables.
I have selected 4 Reaction Coordinates:
1. RC1: Proton transfer from Base residue to leaving group
OE1-HE1 -> C----O4 (this glycosidic bond breaks and HE1 is transferred to O4 )
So, the reaction coordinate for this reaction is difference in distance between OE1-HE1 and O4--HE1.
2. RC2: Glycosidic bond breakage:
C-----O4 -> C O4. The reaction coordinate for this reaction is the distance between C and O4
3. RC3: Nucleophilic attack by water:
H(i)O(w)H(w) [this is nucleophilic water] ---- C (anomeric carbon of the broken glycosidic bond). The reaction coordinate for this reaction is the distance between C and O(w).
4. RC4: Proton transfer from water (H(i)) to Acid Residue
H(i)O(w)H(w) -- OD1 (Acid residue). The reaction coordinate for this step is difference in distance between O(w)-H(i) and OD1-H(i).
For the RC2, I have made the following restraint file:
# distance restraint
&rst iat=8122,8132 r1=0, r2=1.8, r3=1.8, r4=5, rstwt=1,-1, rk2 = 500.0, rk3 = 500.0, /
I have increased the value of r2 and r3 in steps of 0.2 up to 3.4. I do not understand what the values of r1 and r4 should be. Could anyone please comment on this and explain it briefly?
I also do not understand how to write the restraint file for a difference in distances between two sets of atoms, as in the case of RC4 and RC1. It would be helpful if somebody could explain this too, with an example.
I also want to visualize all four reaction steps, so which trajectory files from the four RCs should I look at?
Since I am new to US, it would be a great help if somebody can guide me through this.
Regards
BHARAT
I have RNA-Seq data for different cell lines and I'm looking for lncRNAs which may be differentially expressed.
Hi everyone! I'm trying to work on the acquisition of the Raman Spectra of a leaf section using Confocal Raman Spectroscopy. The samples to be used are pure, dried, and powdered leaf samples. I am going to use a 785 nm laser source.
However, the only thing I got was a spectrum with no peaks, or one strongly masked by fluorescence. Do you have any tricks or sample preparations to avoid the fluorescence, which I'm afraid covers the Raman signal, or to enhance the Raman signal, since the compounds might have a relatively weak Raman signal compared to the background signal and the fluorescence? Are there any sample preparations that can be done without water or an immersion objective, such as solid matrices that can be mixed with the sample? Thank you.
Hi,
We were carrying out in vacuo energy minimization studies of a protein dimer (which is experimentally proven to be a dimer). Earlier, the same work has been done in our lab using an older version of GROMACS (4.5.5) and used Group cutoff schemes with coulomb type= cutoff and with no pbc.
When we restarted the work and had to use GROMACS 5.0.4, the default cutoff scheme had changed to Verlet. We observe that with the Verlet cutoff scheme the monomers dissociate from each other, which is not the case, even in this version, when using the Group cutoff scheme.
I searched the literature and found that the differences are probably in the pairlist generation. In my graduate courses, I read about energy drift in molecular dynamics simulation and am aware (though not in detail) that the Verlet algorithm has something to do with it.
Can anyone elucidate this problem? The minimization runs fine and the protein remains dimerized when using the Group cutoff. This happens even after solvation. We used xyz pbc and the grid neighbour searching type with the default Fourier spacing and rlist, as we did not set the last two parameters explicitly in the mdp file.
I want to know the theory in play behind this. Please help.
I have performed RAPD for V. cholerae isolates with 1281 and 1283 random primers and found a distinct band pattern. I have attached a picture.
I am looking for a command that will merge the 3 chains in the original PDB into a single chain and then renumber all of the residues. I have tried using the alter command, but when I export the PDB I get only one chain (of the initial trimer) and not the merged chain.
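Outside PyMOL, the same result can be obtained by editing the fixed-width ATOM/HETATM records directly. This is a plain-Python alternative rather than a PyMOL command: it assumes standard PDB column positions (chain ID in column 22, residue number in columns 23-26, insertion code in column 27), and the function name is made up for illustration.

```python
def merge_chains(pdb_lines, new_chain="A"):
    """Put all ATOM/HETATM records in one chain and renumber residues from 1."""
    merged = []
    previous_key = None
    new_number = 0
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            # Original chain ID plus residue number and insertion code.
            key = (line[21], line[22:27])
            if key != previous_key:
                new_number += 1
                previous_key = key
            # Rewrite chain ID, residue number and blank the insertion code.
            line = f"{line[:21]}{new_chain}{new_number:>4} {line[27:]}"
        elif line.startswith("TER"):
            continue  # drop chain terminators between the merged chains
        merged.append(line)
    return merged


# Example call (file names are placeholders):
# with open("trimer.pdb") as src, open("merged.pdb", "w") as dst:
#     dst.writelines(merge_chains(src.readlines()))
```

Atom serial numbers are left untouched here; only the chain ID and residue numbering change.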
I am looking for a recent diagnosis for chikungunya virus through computational biology techniques.
Hope everyone is having a good day.
I want to learn computational biology. I have a PhD. in pharmacology. Lots of times I heard about the computational biology/bioinformatics but never had a guideline how to learn or to start this interesting field of research.
It would be very helpful if you can guide me through this.
Have a nice day.
I'd also like to know the recent data sets used in research for the above domain.
Please suggest some protein ligand docking servers to do docking online
I also need some web servers that allow docking multiple ligands at a time.
Thank you to all the knowledgeable persons.
I am curious if there exist any bioinformatics tools that can predict changes in secondary structure for proteins and/or nucleic acids. For instance, say a C-terminal loop on a protein reorganizes into a helix in response to binding to RNA. Or say an intrinsically unstructured segment of RNA forms a transient stem-loop in order to bind to a protein. Are there any computational tools that could predict such a change short of performing in-lab experiments?
Hello, I'm a PhD student. In my master's thesis, I investigated the cytotoxic, apoptotic and cell cycle effects of an anticancer drug (Danusertib) on pancreatic cancer cells (CFPAC-1 and Mia-PaCa-2) using xCelligence and flow cytometry in a cell culture lab.
However, I want to do my PhD thesis with virtual experiments using databases (OMIM, COSMIC, GAD, TCGA) and computing power (maybe on Amazon Web Services, Google Cloud or Azure), both because of limited funding and because I like to spend time with computers. I don't know where to start researching these things. Can I do sound research with these databases? Can anyone give a tip or advice?
Besides gene essentiality and non homology with human proteins.
INVdock software has been used to predict the first receptor of drug with low molecular weight and finding or predicting cell target, I would be thankful to anybody who could let me know how could I get the software and procedure for working with the same.
How is the popularity of INVdock software?
is this software free and what is the procedure of working with that?
I'm trying to use GridMAT to get the area per lipid and thickness so I installed activeperl on my laptop (on windows) and put the three necessary files in and ran this command:
> perl GridMAT-MD.pl param_example
It doesn't give me any valuable output.
#### I attach perl screen and its error ####
Would you help me please to get the desired results?
Genetic association analyses usually involve lots of SNPs, but we generally select a few tagSNPs based on the LD value (r2). How can we calculate the r2 value to know which SNPs should be chosen?
I have docked a ligand to a protein. I want to save the protein-ligand complex as a PDB file in AutoDock so that a protein viewer can display it. Alternatively, why does PyMOL not display the AutoDock result I saved through "write complex"? Thanks
In the FASTA output of Prokka listing the gene names, some genes do not have any name ("gene: NA"). My question is whether these genes are hypothetical or simply do not have a name.
If the former is the case, how does Prokka determine them?
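For two biallelic SNPs with known haplotypes, r2 follows from the allele and haplotype frequencies: D = p_AB − p_A·p_B and r2 = D² / (p_A(1 − p_A)·p_B(1 − p_B)). The sketch below assumes phased haplotypes coded as 0/1 pairs; with unphased genotype data, haplotype frequencies have to be estimated first, which dedicated tools such as PLINK or Haploview handle.

```python
def ld_r_squared(haplotypes):
    """r^2 between two biallelic SNPs from phased haplotypes.

    haplotypes: list of (allele_at_snp1, allele_at_snp2) tuples, alleles coded 0/1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at SNP 1
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at SNP 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # linkage disequilibrium coefficient
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("at least one SNP is monomorphic; r^2 is undefined")
    return d * d / denominator


# Invented toy data: complete LD, then no LD.
print(ld_r_squared([(1, 1)] * 5 + [(0, 0)] * 5))
print(ld_r_squared([(1, 1), (1, 0), (0, 1), (0, 0)]))
```

SNP pairs above a chosen r2 threshold are treated as redundant, so only one SNP of each such group needs to be kept as a tag.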
Hi everyone,
I need to do MD simulation of wild type and ten variants at 50 ns. I am looking for a low-cost cloud service/ simulation environment. Would you please suggest me any?
Thanks in advance.
Can anybody help me in submitting 8 Leishmania donovani partial gene sequences in genbank (ncbi)?
I have eight sequences with small differences.
The promoter region of the ldmdr1 gene of L. donovani was amplified and the PCR products were sequenced using sanger dideoxy sequencing method. Now I have eight different sequences (450-490 nucleotides long) but I don't know the "features" of those sequences as I am not very good in molecular biology. Kindly somebody help me in submitting my sequences. I shall be highly obliged and will acknowledge the person in my PhD thesis :-)
I have a query file, obtained from the GeneMark.hmm tool, which is in nucleotide format. My problem is that I want to run BLASTx, but it gives me a "No alias or index file found for protein database" error. My database is a protein database. I am stuck here; please help me, this work is very important for me.
I'm a molecular biologist, and I have a few projects coming up in transcriptome and small RNA analysis. Can I get by without knowing any programming, using user-friendly software such as Geneious Prime or another program you can suggest, or is programming absolutely a must?
When I load repeat simulation of my mutated protein from 10ns-20ns-30ns, its RMSD graph is in picture. I watched my dcd, my protein goes out of the water box. I tried to put it inside of the box with "pbc wrap -centersel "protein" -center com -compound residue -all" code. The protein entered the box but the RMSD values doesn't change. How can I solve this problem?
Hello,
I trimmed and assembled (Reverse and Forward) my sequences using CLC Main Workbench software. The trimming I did aims to remove poor quality 3’ and 5’.
I’m going to perform multiple alignment, haplotype diversity analysis and phylogenetic tree construction using other software (DnaSP, Arlequin and MEGA).
These programs need sequences of uniform size (same length).
I have about 120 COI sequences of different sizes (640-750bp). The expected size was 710 bp.
Is there any software or method to trim sequences to the same length, either one by one or all together? Which length should I choose for this trimming in my case?
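One common approach is to align the sequences first and then cut the alignment to the block of columns that every sequence covers, instead of picking a length in advance. The sketch below assumes the sequences have already been aligned (gaps written as `-`, all rows of equal length) and only removes leading and trailing columns; internal gaps are left as they are. The names and sequences are invented.

```python
def trim_to_shared_region(aligned):
    """Trim an alignment to the columns covered by every sequence.

    aligned: dict mapping sequence name to gapped sequence of equal length.
    """
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must come from one alignment (equal lengths)")
    # Latest start and earliest end over all sequences.
    start = max(len(seq) - len(seq.lstrip("-")) for seq in aligned.values())
    end = min(len(seq.rstrip("-")) for seq in aligned.values())
    return {name: seq[start:end] for name, seq in aligned.items()}


# Invented toy alignment.
alignment = {
    "seq1": "--ACGTAC",
    "seq2": "TTACGT--",
    "seq3": "-TACGTA-",
}
for name, seq in trim_to_shared_region(alignment).items():
    print(name, seq)
```

With this approach the final length is set by the shortest overlap in the data, which may be below the shortest raw sequence (640 bp here) if the reads start at different positions.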
Hi to all,
I'm using the HADDOCK web tool for the first time. I got the username and password for the easy interface.
I'd like to know whether I'm on the right track.
Once I've uploaded the pdb files to be docked, I have to specify both the active and the passive residues.
In order to determine the active residues I have performed an NMR titration of the unlabelled protein with the labelled ligand and vice versa. Then I've calculated the chemical shift perturbation.
Now I have to determine which among them are the active residues in the protein-ligand interaction.
So, should I submit the PDB to a SASA (solvent accessible surface area) calculation program and choose the residues with chemical shift perturbations that the SASA program also reports as solvent accessible?
Is that correct?
Do you recommend any software or web tool? (I know NACCESS, but there is a very tedious procedure to follow in order to get the codes to decrypt the rar files.)
Thank you.
What should I do for the passive residues? Is the HADDOCK option that determines them automatically reliable?
Bye
Can you suggest some free journals in the field of bioinformatics and computational biology that provide green open access? If anyone can recommend a free-to-publish journal with relevant scope, it will be greatly appreciated!
I downloaded a database from BindingDB. It contains a lot of duplicate structures. How can I remove these duplicate structures?
Suppose I have a DNA sequence and I want to find the transcription start site, CDS, poly-A signal, etc. Which software will be useful to find these?
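If the download is a table with one row per measurement, duplicates can be dropped by keeping the first row seen for each structure identifier. This is a minimal sketch, assuming each row carries a column that identifies the structure unambiguously, such as an InChIKey; the column name `inchi_key` and the example rows are hypothetical. Plain SMILES strings are less reliable as keys, because the same molecule can be written in several ways unless the strings are canonicalised first.

```python
def drop_duplicate_structures(rows, key="inchi_key"):
    """Keep the first row for every distinct value of `key`."""
    seen = set()
    unique = []
    for row in rows:
        identifier = row[key].strip()
        if identifier not in seen:
            seen.add(identifier)
            unique.append(row)
    return unique


# Invented example rows with made-up identifiers.
rows = [
    {"inchi_key": "KEY-0001", "name": "compound 1"},
    {"inchi_key": "KEY-0002", "name": "compound 2"},
    {"inchi_key": "KEY-0001", "name": "compound 1 (second assay)"},
]
for row in drop_duplicate_structures(rows):
    print(row["inchi_key"], row["name"])
```

Rows read with `csv.DictReader` from a tab-separated export can be passed in directly, after setting `key` to the actual column header in the file.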
In the context of mapping next generation sequence reads (of RNA->cDNA) to a reference genome to estimate allele specific (AS) expression:
Allelic imbalance (AI: more reads mapping to one allele than another) can be due to a variety of technical and biological factors, so it is important to control for causes of AI that are not biological if you want to estimate AS expression. Several strategies have been developed to address these problems, including read masking and genomic blacklists.
What is the difference between read masking and genomic blacklists?
Thanks!!!
Hi everyone!
I've been studying the expression of 5 microRNAs using TaqMan assays in several cell lines (the microRNAs were selected through a literature review), and I obtained statistically significant results, although they were not what I expected.
Is it now possible to perform gene ontology studies and construct networks in Cytoscape to better understand the role of these microRNAs in my samples, without a prior global microRNA expression analysis?
A few of us want to create a Discord server for biophysics. We intend it to be a common place for discussions and numerical experiments, and possibly to document the results as blogs or other media.
I believe that there are many biophysics/computational biophysics/Molecular Dynamics enthusiasts here. Here is the server link: https://discord.gg/qRQRq2k
Come and join us. Let us learn together.
I was trying to build a nanocluster of 10 nm diameter using Materials Studio (MS). Because MS has no "nm" size option, I used the 50 Å (angstrom) radius option to construct the nanocluster. I was confused to find 60000 atoms in the generated cluster of 10 nm diameter (100 Å). Is my conversion (nm to Å) correct, or is the Å used in MS different from what I assumed? I doubt that a 10 nm nanoparticle would contain 60000 atoms. Please help me with this issue.
Thanking you
Hari Prasath
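The arithmetic in this question can be checked with a short calculation. The sketch below uses only the numbers stated above (10 nm diameter, 60000 atoms) and the standard conversion 1 nm = 10 Å. It reports the atomic number density that 60000 atoms would imply. Whether that density is reasonable depends on the material, which the question does not state.

```python
import math

ANGSTROM_PER_NM = 10.0  # 1 nm = 10 angstrom


def sphere_volume_A3(diameter_nm):
    """Volume of a sphere in cubic angstrom, given its diameter in nm."""
    radius_A = diameter_nm * ANGSTROM_PER_NM / 2.0
    return 4.0 / 3.0 * math.pi * radius_A ** 3


diameter_nm = 10.0   # from the question
n_atoms = 60000      # atom count reported in the question

volume = sphere_volume_A3(diameter_nm)
implied_density = n_atoms / volume  # atoms per cubic angstrom

print(f"radius = {diameter_nm * ANGSTROM_PER_NM / 2:.0f} A")
print(f"volume = {volume:.0f} A^3")
print(f"implied number density = {implied_density:.4f} atoms/A^3")
```

A 10 nm diameter does correspond to a 50 Å radius. The implied density can then be compared with the bulk number density of the material, i.e. atoms per unit cell divided by unit cell volume.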
I actually have two queries. Biochemical studies suggest the presence of an enzyme in an organism, but the gene encoding the enzyme is not known. I would like to find candidate genes by homology search/sequence similarity, using sequences of similar enzymes present in other organisms. My questions are:
1. What points should I consider when selecting an already known gene/protein for use in the homology search?
2. To find the orthologue of the known gene/protein by bioinformatics, which database/software should I use?
Please suggest papers/websites/software for beginners.
Thanks!
We have a dataset containing a large number of proteins from PDB, SwissProt, TargetDB and unknown sources. We have no information about their nucleotide sequences, but we are interested in understanding codon preference in these proteins. I would highly appreciate any advice on how to retrieve the original nucleotide sequences of these proteins.
I have 14 miRNAs that are related to a particular disease. I want to draw a network like a gene network (GeneMANIA). In GeneMANIA I can easily draw a network by entering gene names, but which software accepts miRNA names as input in the same way? Which software is better? I tried Cytoscape, but it requires existing network data (if I am not wrong), and I am not sure whether I can get such data for miRNAs. Some of the miRNAs are quite new, and some older versions of the software do not recognize them.
Please help me get a network like the one from GeneMANIA. I can only provide the miRNA names and the particular disease. Thanks.
I am interested in analysing a set of differentially expressed genes. We used to have access to Ingenuity, where I discovered the "Upstream Regulator" tool, which identifies upstream regulatory proteins enriched for your dataset.
Since we no longer have access to Ingenuity, I have been trying to find an alternative, preferably free.
I am not looking for pathway analysis tools, but specifically for this kind of upstream regulator tool.
Thank you,
I am working on FIV and have sequenced each gene individually, and I have retrieved multiple complete genomes and individual genes from GenBank. I am trying to align all the different genes to the complete genome, so I need a program that can align each individual sequence to one reference sequence. I have tried MEGA, but as I understand it, the pairwise alignment compares two consecutive sequences in the list and aligns them, then the next pair, and so forth (this won't work, as I don't want to align different genes to each other), while the multiple sequence alignment tries to align each sequence to all the other sequences (this won't work for the same reason). Is there a setting in MEGA that lets me set a reference sequence to which all other sequences are compared, or is there another program that can do this?
The only other option I can think of is to compare each sequence individually to the reference sequence and that will be a nightmare as I have 4289 sequences. Please if you have any suggestions let me know. Thank you.
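One option outside MEGA, offered as a suggestion rather than a tested recipe, is MAFFT's `--addfragments` mode, which aligns a set of shorter sequences against an existing alignment or a single reference. With `--keeplength` the reference coordinates stay unchanged. The sketch below only assembles the command. It assumes MAFFT is installed, the file names are hypothetical, and the options should be checked against the installed MAFFT version.

```python
import shlex


def mafft_add_to_reference(reference_fasta, queries_fasta):
    """Build a MAFFT command that aligns every query to one reference.

    --addfragments  align each query against the existing reference
    --keeplength    keep reference length (insertions in queries are dropped)
    """
    return [
        "mafft", "--auto", "--keeplength",
        "--addfragments", queries_fasta,
        reference_fasta,
    ]


# Hypothetical file names for illustration only.
cmd = mafft_add_to_reference("fiv_reference_genome.fasta", "fiv_genes.fasta")
print(shlex.join(cmd) + " > genes_on_reference.fasta")
```

All 4289 sequences can go into one query file, so the reference comparison does not have to be repeated by hand.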
Which book should I read to understand bioinformatics from the very beginning?
Hi everyone, I'm currently taking the course "Whole-genome sequencing and its applications" from the Technical University of Denmark and am working on the final project.
I have five unknown samples, provided as already assembled genomes, i.e. in FASTA format rather than FASTQ or raw reads.
Three of the five genomes are outbreak strains, and I have to find out which three.
Because the samples are unknown, I first have to identify what they are, and then work out how to treat them.
I was given some hints for answering these questions.
For example, to find out what they are, I can use KmerFinder to identify the species, and once I know the species I can determine the sequence type with the MLST tool.
Then I can check whether my samples contain any plasmids using PlasmidFinder.
If a sample contains a plasmid, I can find out what kind it is with pMLST (plasmid typing).
I am trying to run SANDER in parallel on a cluster. In GROMACS, I used to run
dplace -c 0-59 mdrun -v -s em.tpr -c em.pdb -nt 60 (here I was pinning the job to CPUs 0-59, and -nt set the number of threads).
For Sander in Amber, I tried
mpirun -np 10 sander -O -i min.in -o min.out -p test-solv.prmtop -c test-solv.inpcrd -r min.rst -ref test.inpcrd
I have also tried
mpirun -np 10 pmemd -O -i heat.in -o heat.out -p test-solv.prmtop -c min.rst -r heat.rst -x heat.mdcrd -ref min.rst -inf heat.mdinfo &
However, it runs on only one CPU. I have also tried adding -nt 10 or -t 10, etc., but these produce errors.
Can somebody help me find the correct command to assign 10 CPUs to an MD calculation with Sander? I have followed the mpirun usage from the Amber FAQ and group discussions, but it does not work on my cluster.
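One likely cause, stated as an assumption because the cluster setup is not described: in Amber the MPI-enabled programs are built as separate executables named `sander.MPI` and `pmemd.MPI`. Plain `sander` and `pmemd` are serial, so launching them through `mpirun` does not spread one calculation over several CPUs. The sketch below rebuilds the minimization command from the question with the MPI executable. It assumes the parallel build is installed and on the `PATH`.

```python
import shlex


def amber_mpi_command(n_procs, engine="sander.MPI"):
    """Build the minimization command from the question for an MPI engine.

    engine -- "sander.MPI" or "pmemd.MPI" (the parallel Amber executables)
    """
    return [
        "mpirun", "-np", str(n_procs), engine, "-O",
        "-i", "min.in", "-o", "min.out",
        "-p", "test-solv.prmtop", "-c", "test-solv.inpcrd",
        "-r", "min.rst", "-ref", "test.inpcrd",
    ]


cmd = amber_mpi_command(10)
print(shlex.join(cmd))
```

If `sander.MPI` is not found, the Amber installation was probably compiled without MPI support and would need a parallel build.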