
research-software-development
by ZelongGuo
:gear: My MBP configurations.
SKILL.md
name: research-software-development description: Skill for research-oriented scientific software development, balancing reproducibility, correctness, and long-term maintainability, with selective engineering rigor.
Research Software Development Skill
This skill provides guidance for research-driven software development, where the primary goals are scientific correctness, reproducibility, and clarity, while applying engineering rigor selectively when software evolves toward long-lived tools, shared codebases, or community-facing packages.
It draws inspiration from Clean Architecture and Domain-Driven Design (DDD), but adapts them to the realities of scientific computing, numerical modeling, and exploratory research workflows (e.g., InSAR/GNSS processing, geophysical modeling, and data analysis pipelines).
Guiding Philosophy
- Science first, engineering second: prioritize correctness, traceability, and interpretability over premature abstraction.
- Reproducibility over cleverness: code should make experiments easy to reproduce and audit.
- Progressive rigor: notebooks and scripts are acceptable early; architectural structure increases as code matures.
- Models are research assets: algorithms, physical models, and inversion logic are more valuable than infrastructure glue.
Code Style Rules
General Principles
-
Early return pattern: prefer early returns over deeply nested conditionals to improve readability.
-
Avoid copy-paste logic in scientific workflows; extract reusable functions for (for example):
- forward models
- kernels / Green's functions
- misfit and regularization terms
-
Functions exceeding 80 lines should be decomposed when possible.
-
Files exceeding 200 lines should be split only if it improves conceptual clarity (not just to satisfy style).
-
Prefer pure functions for scientific computation (inputs → outputs, no hidden state).
Numerical and Scientific Code Practices
- Make units, coordinate systems, and conventions explicit in names and documentation.
- Avoid magic numbers; define physical constants and hyperparameters clearly.
- Favor readability over micro-optimizations unless the code is on a proven performance-critical path.
- Numerical stability and physical interpretability take precedence over abstraction purity.
Library-First (but Research-Aware) Approach
Preferred Strategy
-
ALWAYS check existing scientific libraries before writing custom implementations:
- numerical linear algebra (e.g., BLAS/LAPACK wrappers)
- optimization and inversion libraries
- geospatial and remote-sensing toolkits
-
Reuse well-tested libraries for:
- IO, file formats, coordinate transforms
- plotting and visualization
- parallelization and acceleration
When Custom Code Is Justified
Custom implementations are appropriate when:
- The logic encodes domain-specific physical models (e.g., fault slip, viscoelastic relaxation).
- Existing libraries do not support required assumptions or geometries.
- The code represents a research contribution rather than infrastructure.
- Full transparency is required for peer review and reproducibility.
- Performance-critical kernels require tailored optimization.
Scientific novelty is a valid reason to write custom code.
Architecture and Structure
Research-Oriented Clean Architecture
-
Keep scientific core logic independent of:
- plotting frameworks
- file system layout
- command-line interfaces
-
Separate clearly:
- physical / mathematical models
- numerical solvers
- data preparation and visualization
Typical layering (conceptual, not rigid):
- Domain layer: equations, physical assumptions, forward/inverse models
- Application layer: experiment setup, parameter sweeps, inversion workflows
- Infrastructure layer: IO, plotting, parallel execution, external tools
Naming Conventions (Critical for Research Code)
-
AVOID vague names:
utils,helpers,misc,test2 -
USE names that encode scientific meaning:
TriangularDislocationKernelAfterSlipInversionViscoelasticRelaxationModel
-
Prefer names that would still make sense 5 years later or to another researcher.
Separation of Concerns (Applied Pragmatically)
- Do NOT mix physical equations with plotting logic.
- Do NOT embed hard-coded experiment parameters inside core model functions.
- Keep inversion configuration separate from inversion algorithms.
- Allow notebooks/scripts to orchestrate experiments, but keep them thin.
Scripts orchestrate; libraries explain.
Anti-Patterns to Avoid in Research Software
-
Re-implementing standard numerical methods without justification.
-
Over-engineering early-stage exploratory code.
-
Monolithic scripts that mix:
- data loading
- modeling
- inversion
- visualization
-
"One-off" hacks silently becoming production research code.
Remember:
Undocumented scientific code is irreproducible science.
Code Quality and Reproducibility
-
Ensure deterministic behavior where possible (random seeds, fixed solvers).
-
Use clear error messages for invalid physical assumptions or parameter ranges.
-
Keep functions under 50 lines when feasible, but allow exceptions for mathematically cohesive blocks.
-
Prefer explicit over implicit behavior, even if slightly verbose.
-
Minimal but meaningful documentation:
- what problem is solved
- what assumptions are made
- what each parameter represents
Transition to Engineering-Grade Software
When research code evolves into:
- shared group tools
- open-source packages
- long-term modeling frameworks
then progressively introduce:
- stronger module boundaries
- stricter testing
- clearer public APIs
- more conventional Clean Architecture discipline
Core Principle
In research software, clarity and correctness define quality; architecture exists to preserve them as complexity grows.
スコア
総合スコア
リポジトリの品質指標に基づく評価
SKILL.mdファイルが含まれている
ライセンスが設定されている
100文字以上の説明がある
GitHub Stars 100以上
3ヶ月以内に更新
10回以上フォークされている
オープンIssueが50未満
プログラミング言語が設定されている
1つ以上のタグが設定されている
レビュー
レビュー機能は近日公開予定です


