Protein Structure Viewer
Personal Projects · #Bioinformatics #Python #Data Visualization

Overview

A single-page Dash application for exploring 3D protein structures directly in the browser. Users can drag-and-drop a local .pdb or .cif file or type a 4-character PDB ID (e.g. 1CRN, 4HHB) to fetch any entry from the RCSB Protein Data Bank on demand. The structure is then rendered with dash-bio’s Molecule3dViewer alongside a one-letter amino acid sequence strip and an optional statistics panel.
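
As a rough illustration of the fetch-by-ID path, the sketch below validates a typed ID and builds a download URL. The function names are hypothetical and not taken from shared_utils.py, and the use of RCSB's files.rcsb.org download host is an assumption about how the app retrieves entries.

```python
import re

# Classic PDB IDs are 4 characters: a digit followed by three alphanumerics.
PDB_ID_PATTERN = re.compile(r"^[0-9][A-Za-z0-9]{3}$")

# Assumption: structures come from RCSB's public file download host.
RCSB_DOWNLOAD_URL = "https://files.rcsb.org/download/{pdb_id}.{ext}"


def normalize_pdb_id(raw):
    """Return the upper-cased ID, or raise ValueError if it is malformed."""
    candidate = raw.strip().upper()
    if not PDB_ID_PATTERN.match(candidate):
        raise ValueError(f"not a 4-character PDB ID: {raw!r}")
    return candidate


def rcsb_url(raw_id, ext="pdb"):
    """Build the download URL for a PDB ('pdb') or mmCIF ('cif') file."""
    if ext not in ("pdb", "cif"):
        raise ValueError(f"unsupported format: {ext!r}")
    return RCSB_DOWNLOAD_URL.format(pdb_id=normalize_pdb_id(raw_id), ext=ext)
```

Validating before any network call lets the UI reject input such as a protein name immediately instead of waiting on a failed request.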

All parsing, RCSB fetching, and styling logic is centralized in shared_utils.py, keeping the main protein_dashboard.py focused on layout and callbacks.
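
To give a sense of what the parsing layer has to do for the sequence strip, here is a minimal sketch that reads ATOM records from PDB-format text and derives a one-letter sequence per chain. It is not the code in shared_utils.py; the function name is hypothetical, and it relies only on the fixed column layout of the PDB format.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def chain_sequences(pdb_text):
    """Map chain ID -> one-letter sequence, using ATOM records only."""
    residues = {}   # chain ID -> list of one-letter codes, in file order
    seen = set()    # (chain ID, residue number + insertion code) already counted
    for line in pdb_text.splitlines():
        # HETATM records (ligands, waters) are skipped on purpose.
        if not line.startswith("ATOM") or len(line) < 27:
            continue
        res_name = line[17:20].strip()   # columns 18-20
        chain_id = line[21]              # column 22
        res_key = (chain_id, line[22:27])  # columns 23-27
        if res_key in seen:
            continue  # another atom of a residue we already recorded
        seen.add(res_key)
        # Unknown or modified residues fall back to 'X'.
        residues.setdefault(chain_id, []).append(THREE_TO_ONE.get(res_name, "X"))
    return {chain: "".join(codes) for chain, codes in residues.items()}
```

Because every view can be derived from one parsed result like this, the viewer, sequence strip, and stats panel stay in agreement about chains and residue numbering.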

View the project on GitHub

Key Achievements

  • Built an end-to-end 3D protein viewer that handles both local files and RCSB-hosted structures from a single UI
  • Rendered multi-chain structures with HETATM groups (e.g. 4HHB hemoglobin with ~4,800 atoms, 4 chains, and heme cofactors)
  • Implemented six coordinated UI controls (style, color scheme, background, chain filter, residue range, HETATM toggle) wired to a shared viewer state
  • Computed backbone φ/ψ dihedral angles directly from the parsed structure and plotted them as a Ramachandran plot
  • Solved a Windows install blocker by shipping a local parmed.py stub so dash-bio imports cleanly without building ParmEd through MSVC
  • Pinned all transitive dependencies explicitly to make the environment reproducible on Python 3.8–3.11
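
The φ/ψ computation mentioned above can be sketched with NumPy as follows. This is a generic dihedral-angle routine under stated assumptions, not the project's own implementation: the function names and the (n_residues, 3, 3) array layout holding N, CA, and C coordinates per residue are illustrative choices.

```python
import numpy as np


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points.

    Positive values mean a clockwise rotation when looking from p1 toward p2.
    """
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)
    # Components of b0 and b2 perpendicular to the central bond b1.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))


def phi_psi(backbone):
    """Return (phi, psi) pairs for the interior residues of one chain.

    backbone: array of shape (n_residues, 3, 3) with N, CA, C coordinates
    for consecutive residues (assumed to have no chain breaks).
    """
    angles = []
    for i in range(1, len(backbone) - 1):
        n, ca, c = backbone[i]
        prev_c = backbone[i - 1][2]   # C of the previous residue
        next_n = backbone[i + 1][0]   # N of the next residue
        phi = dihedral(prev_c, n, ca, c)
        psi = dihedral(n, ca, c, next_n)
        angles.append((phi, psi))
    return angles
```

The first and last residues of a chain are skipped because φ needs the preceding residue's C atom and ψ needs the following residue's N atom.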

Architecture

The Dash app in protein_dashboard.py defines the layout and callbacks; every data-touching function lives in shared_utils.py so the viewer, sequence strip, and stats panel all read from the same parsed structure. A minimal parmed.py stub at the project root satisfies dash-bio’s optional ParmEd import on Windows without pulling in the real package, since the viewer does not use ParmEd’s functionality.

| Technology | Purpose |
| --- | --- |
| Python 3.8–3.11 | Core language for the app and parsing layer |
| Dash | Single-page app shell, layout, and reactive callbacks |
| dash-bio | Molecule3dViewer for the interactive 3D structure view |
| RCSB PDB API | On-demand structure fetching by 4-character ID |
| NumPy | Backbone dihedral math for the Ramachandran plot |
| Plotly | Amino acid composition bar chart and Ramachandran scatter |
| parmed.py stub | Local shim so dash-bio imports cleanly on Windows |

Core Features

  • Load structures two ways: drag-and-drop a local .pdb or .cif file, or type a 4-character PDB ID and fetch directly from RCSB.
  • Visualization controls: switch between cartoon, stick, and sphere rendering; recolor by atom, residue, chain, or residue type; change the background color.
  • Ligand handling: toggle HETATM records (heme groups, cofactors, waters) on and off and render them independently of the main protein style.
  • Selection tools: filter by chain, highlight a residue range or list (e.g. 10-50 or 10,15,20), or click any residue in the sequence view to focus it in the 3D viewer.
  • Sequence view: one-letter amino acid sequence rendered above the viewer, grouped by chain, with hover tooltips showing chain and residue number.
  • Structure stats panel (expandable): per-chain residue, atom, and HETATM counts, an amino acid composition bar chart, and a Ramachandran plot computed from the backbone dihedral angles.
  • Quick-test coverage: 1CRN (crambin, 327 atoms, 1 chain) for fast sanity checks and 4HHB (hemoglobin, ~4,800 atoms, 4 chains, heme groups) for multi-chain and HETATM behavior.
  • Dynamic amino acid breakdown: the composition chart is recomputed for each loaded protein.
  • Dynamic Ramachandran plots: the φ/ψ scatter is recomputed for each loaded protein.
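
The residue selection syntax from the list above (10-50 or 10,15,20) can be parsed along these lines. This is a minimal sketch with a hypothetical function name; it assumes non-negative residue numbers and does not reflect how the project itself handles malformed input.

```python
def parse_residue_selection(text):
    """Parse '10-50', '10,15,20', or a mix into a sorted list of residue numbers."""
    selected = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty input and stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                start, end = end, start  # accept reversed ranges such as 50-10
            selected.update(range(start, end + 1))
        else:
            selected.add(int(part))
    return sorted(selected)
```

Returning a sorted, de-duplicated list keeps the highlight logic simple even when ranges and single residues overlap.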