
interview-ingest
by linxule
Epistemic partnership infrastructure for AI-assisted qualitative research. Claude Code plugin with a 3-stage methodology, 4 specialized agents, and 11 skills.
SKILL.md
name: interview-ingest
description: "This skill should be used when users have audio interview recordings to transcribe, need to convert PDF documents, mention 'import data', 'transcribe', or 'convert', or are starting data preparation for Stage 1 or Stage 2."
interview-ingest
Audio transcription and document conversion for qualitative data import. Converts interview recordings, PDFs, and other formats into analyzable markdown.
When to Use
Use this skill when:
- User has audio interview recordings to transcribe
- User needs to convert PDF documents
- User mentions "import data", "transcribe", or "convert"
- Starting data preparation for Stage 1 or Stage 2
MCP Dependencies
This skill operates at three capability tiers:
Tier 1: Best (Requires MinerU API key)
- PDFs: MinerU VLM-powered parsing (90%+ accuracy)
- Tables/Images: Excellent extraction
- Audio: Falls back to Markdownify
- Best for: Complex academic papers, documents with tables/figures
Tier 2: Good (Bundled - no API key)
- PDFs: Markdownify conversion
- Audio: Markdownify transcription
- Tables/Images: Basic extraction
- Best for: Simple documents, interview recordings
Tier 3: Basic (Fallback)
- PDFs: Manual copy/paste or OCR
- Audio: External transcription service
- Guidance provided for manual workflow
Checking Tier Availability
# Check for MinerU
[ -n "$MINERU_API_KEY" ] && echo "MinerU available (Tier 1)"
# Markdownify is always available (bundled)
echo "Markdownify available (Tier 2)"
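The same tier check can be sketched as a small Node function, which is handy if a script needs to pick a tool per format. This is a minimal sketch, assuming a non-empty `MINERU_API_KEY` is the only signal for Tier 1; `detectTier` and the `markdownifyAvailable` flag are hypothetical names and not part of the plugin.

```javascript
// Sketch: choose a capability tier from environment variables.
// Assumption: Tier 1 is signalled only by a non-empty MINERU_API_KEY.
// `markdownifyAvailable` is a hypothetical flag; the document treats
// Markdownify as bundled, so it defaults to true.
function detectTier(env, markdownifyAvailable = true) {
  const hasMinerU =
    typeof env.MINERU_API_KEY === "string" && env.MINERU_API_KEY.length > 0;

  if (hasMinerU) {
    // Tier 1: MinerU handles PDFs; audio still goes through Markdownify.
    return {
      tier: 1,
      pdf: "mineru",
      audio: markdownifyAvailable ? "markdownify" : "manual",
    };
  }
  if (markdownifyAvailable) {
    // Tier 2: Markdownify handles both PDFs and audio.
    return { tier: 2, pdf: "markdownify", audio: "markdownify" };
  }
  // Tier 3: fall back to the manual workflow.
  return { tier: 3, pdf: "manual", audio: "manual" };
}

console.log(detectTier(process.env));
```

Passing the environment in as an argument, rather than reading `process.env` inside the function, keeps the check easy to test.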
Workflow by Format
Audio Interviews
Tier 1/2 (Markdownify):
# Transcribe audio file
markdownify audio-to-markdown /path/to/interview.mp3
# Output: interview.md with transcript
Best practices:
- Use high-quality recordings when possible
- Review transcripts for accuracy
- Add speaker labels if not auto-detected
- Note timestamps for key passages
PDF Documents
Tier 1 (MinerU - recommended for complex PDFs):
// Parse PDF with VLM mode for tables/images
mineru_parse({
  url: "file:///path/to/paper.pdf",
  model: "vlm",
  formula: true,
  table: true
})
Tier 2 (Markdownify):
# Convert PDF to markdown
markdownify pdf-to-markdown /path/to/paper.pdf
Other Formats
| Format | Tool | Notes |
|---|---|---|
| DOCX | Markdownify | Good conversion |
| PPTX | Markdownify | Extracts text + images |
| XLSX | Markdownify | Tables preserved |
| Images | Markdownify | OCR + metadata |
| YouTube | Markdownify | Captions/transcript |
| Web pages | Markdownify or Jina | Full content |
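When sources arrive in mixed formats, it can help to classify each one before choosing a row from the table above. The following is a rough sketch only: the extension list is an illustrative assumption rather than an official list of supported types, and `classifySource` is a hypothetical helper.

```javascript
// Sketch: map a file path or URL to one of the format rows in the table.
// Assumption: the extension groups below are illustrative, not exhaustive.
const FORMAT_BY_EXTENSION = {
  ".mp3": "audio", ".wav": "audio", ".m4a": "audio",
  ".pdf": "pdf",
  ".docx": "docx", ".pptx": "pptx", ".xlsx": "xlsx",
  ".png": "image", ".jpg": "image", ".jpeg": "image",
};

function classifySource(source) {
  // URLs are either YouTube links or generic web pages.
  if (/^https?:\/\//i.test(source)) {
    try {
      const host = new URL(source).hostname.toLowerCase();
      const isYouTube =
        host === "youtu.be" || host === "youtube.com" || host.endsWith(".youtube.com");
      return isYouTube ? "youtube" : "webpage";
    } catch {
      return "unknown"; // malformed URL
    }
  }
  // Local files are classified by their (case-insensitive) extension.
  const dot = source.lastIndexOf(".");
  const ext = dot === -1 ? "" : source.slice(dot).toLowerCase();
  return FORMAT_BY_EXTENSION[ext] ?? "unknown";
}

console.log(classifySource("/path/to/interview.mp3")); // "audio"
console.log(classifySource("/path/to/paper.pdf"));     // "pdf"
```

Anything that comes back as `"unknown"` is a candidate for the manual workflow.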
Scripts
process-audio.js
Batch process interview recordings.
node skills/interview-ingest/scripts/process-audio.js \
  --project-path /path/to/project \
  --input-dir /path/to/recordings \
  --output-dir stage1-foundation/manual-codes
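The internals of `process-audio.js` are not shown here. As an illustration of the planning step such a batch run needs, the sketch below maps recording file names to output paths using the `P001-recording.mp3` to `P001-interview.md` naming seen in this document. `planBatch` and the audio extension list are assumptions, not the script's actual code.

```javascript
// Sketch: plan a batch run by pairing each recording with its output path.
// Assumptions: recordings are named "<ID>-<anything>.<ext>" and the
// participant ID is everything before the first hyphen.
const AUDIO_EXTENSIONS = [".mp3", ".wav", ".m4a"]; // assumed set

function planBatch(fileNames, outputDir) {
  return fileNames
    .filter((name) =>
      AUDIO_EXTENSIONS.some((ext) => name.toLowerCase().endsWith(ext))
    )
    .sort()
    .map((name) => {
      const stem = name.slice(0, name.lastIndexOf(".")); // drop the extension
      const id = stem.split("-")[0]; // "P001-recording" -> "P001"
      return { id, input: name, output: `${outputDir}/${id}-interview.md` };
    });
}

const plan = planBatch(
  ["P002-recording.mp3", "notes.txt", "P001-recording.mp3"],
  "stage1-foundation/manual-codes"
);
console.log(plan);
```

Non-audio files such as `notes.txt` are skipped rather than reported, so a real script would likely want to log them.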
Output Organization
stage1-foundation/
├── manual-codes/
│   ├── P001-interview.md      # Transcribed interviews
│   ├── P002-interview.md
│   └── ...
├── raw-data/                  # Original files (optional)
│   ├── P001-recording.mp3
│   └── ...
└── data-inventory.json        # Tracks all data sources
data-inventory.json
{
  "documents": [
    {
      "id": "P001",
      "original_file": "P001-recording.mp3",
      "converted_file": "P001-interview.md",
      "format": "audio",
      "conversion_tool": "markdownify",
      "conversion_date": "2025-01-15",
      "duration_minutes": 45,
      "notes": "Good audio quality"
    }
  ]
}
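Keeping the inventory consistent is easier with a small helper that validates each entry before adding it. This is a sketch under stated assumptions: the set of required fields is inferred from the example above rather than from a published schema, and `addInventoryEntry` is a hypothetical function.

```javascript
// Sketch: add one document entry to an inventory object without mutating it.
// Assumption: the required fields below are inferred from the example entry.
const REQUIRED_FIELDS = [
  "id", "original_file", "converted_file", "format", "conversion_tool",
];

function addInventoryEntry(inventory, entry) {
  const missing = REQUIRED_FIELDS.filter((key) => !entry[key]);
  if (missing.length > 0) {
    throw new Error(`Missing fields: ${missing.join(", ")}`);
  }
  if (inventory.documents.some((doc) => doc.id === entry.id)) {
    throw new Error(`Duplicate id: ${entry.id}`);
  }
  // Default the conversion date to today (YYYY-MM-DD, UTC) when not given.
  const conversion_date =
    entry.conversion_date ?? new Date().toISOString().slice(0, 10);
  return { documents: [...inventory.documents, { ...entry, conversion_date }] };
}

const updated = addInventoryEntry({ documents: [] }, {
  id: "P001",
  original_file: "P001-recording.mp3",
  converted_file: "P001-interview.md",
  format: "audio",
  conversion_tool: "markdownify",
  conversion_date: "2025-01-15",
});
console.log(JSON.stringify(updated, null, 2));
```

Rejecting duplicate IDs up front keeps the audit trail unambiguous when quotes are later traced back to a source.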
Quality Considerations
Audio Transcription
- Review all transcripts - AI transcription makes errors
- Add speaker labels - "Interviewer:" and "Participant:"
- Note unclear passages - Mark with [unclear] or [inaudible]
- Include timestamps - For later reference to the original recording
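Some of these checks can be partly automated before the human review. The sketch below counts lines that lack a speaker label, counts `[unclear]` and `[inaudible]` markers, and reports whether any timestamps are present. It assumes timestamps look like `[mm:ss]` or `[hh:mm:ss]`, which this document does not specify, and `checkTranscript` is a hypothetical helper that supplements manual review rather than replacing it.

```javascript
// Sketch: quick quality report for a transcribed interview.
// Assumptions: speaker labels are "Interviewer:" / "Participant:" and
// timestamps use a bracketed [mm:ss] or [hh:mm:ss] form.
const SPEAKER_LINE =
  /^(\[\d{1,2}:\d{2}(:\d{2})?\]\s*)?(Interviewer|Participant):/;

function checkTranscript(markdown) {
  const lines = markdown
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== "");

  return {
    totalLines: lines.length,
    // Non-empty lines that do not start with a speaker label.
    unlabeled: lines.filter((line) => !SPEAKER_LINE.test(line)).length,
    // Passages the transcriber marked as hard to hear.
    unclearMarks: (markdown.match(/\[(unclear|inaudible)\]/g) || []).length,
    hasTimestamps: /\[\d{1,2}:\d{2}(:\d{2})?\]/.test(markdown),
  };
}

const sample = [
  "[00:05] Interviewer: How did the project start?",
  "Participant: It was [unclear] at first.",
  "A stray line without a label",
].join("\n");
console.log(checkTranscript(sample));
```

A high `unlabeled` count usually means speaker labels still need to be added by hand.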
PDF Conversion
- Check table accuracy - Complex tables may need manual fixes
- Verify figures - May need manual description
- Review formatting - Headers, lists, emphasis
Integration with Stages
Stage 1 Preparation
- Transcribe/convert all data sources
- Organize in stage1-foundation/
- Create data-inventory.json
- Begin manual coding on converted files
Stage 2 Processing
- @dialogical-coder works with markdown files
- Quotes reference line numbers in converted files
- Audit trail links back to original sources
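Because quotes reference line numbers in the converted files, a lookup helper can confirm that a quoted passage still sits where the citation says it does. This is a minimal sketch, assuming 1-based line numbers and case-insensitive matching within a single line; `findQuoteLines` is a hypothetical name, and quotes that span several lines would need a different approach.

```javascript
// Sketch: return the 1-based line numbers where a quote appears.
// Assumption: a quote fits on a single line of the converted markdown.
function findQuoteLines(markdown, quote) {
  const needle = quote.trim().toLowerCase();
  if (needle === "") return [];
  return markdown
    .split(/\r?\n/)
    .map((text, index) => ({ line: index + 1, text }))
    .filter(({ text }) => text.toLowerCase().includes(needle))
    .map(({ line }) => line);
}

const transcript = [
  "Interviewer: How did you feel in the first month?",
  "Participant: Honestly, I felt lost most of the time.",
  "Interviewer: What changed after that?",
].join("\n");
console.log(findQuoteLines(transcript, "I felt lost")); // [ 2 ]
```

An empty result means the quote was paraphrased or the file changed after coding, which is worth recording in the audit trail.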
Fallback Guidance
If automated transcription or conversion is unavailable:
Audio Options:
- Otter.ai - Good transcription service
- Rev.com - Professional transcription
- YouTube auto-captions - Upload as unlisted video
- Manual transcription - Time-intensive but accurate
PDF Options:
- Adobe Acrobat - Export to Word/text
- Google Docs - Open PDF, auto-OCR
- Manual copy/paste - For short documents
Related
- MCPs: MinerU (optional), Markdownify (bundled)
- Skills: document-conversion for detailed PDF handling
- Commands: Data import commands

