
interview-ingest

by linxule

Epistemic partnership infrastructure for AI-assisted qualitative research. Claude Code plugin with 3-stage methodology, 4 specialized agents, and 11 skills.

1🍴 0📅 Jan 22, 2026

SKILL.md


name: interview-ingest
description: "This skill should be used when users have audio interview recordings to transcribe, need to convert PDF documents, mention 'import data', 'transcribe', or 'convert', or are starting data preparation for Stage 1 or Stage 2."

interview-ingest

Audio transcription and document conversion for qualitative data import. Converts interview recordings, PDFs, and other formats into analyzable markdown.

When to Use

Use this skill when:

  • User has audio interview recordings to transcribe
  • User needs to convert PDF documents
  • User mentions "import data", "transcribe", "convert"
  • Starting data preparation for Stage 1 or Stage 2

MCP Dependencies

This skill operates at three capability tiers:

Tier 1: Best (Requires MinerU API key)

  • PDFs: MinerU VLM-powered parsing (90%+ accuracy)
  • Tables/Images: Excellent extraction
  • Audio: Falls back to Markdownify
  • Best for: Complex academic papers, documents with tables/figures

Tier 2: Good (Bundled - no API key)

  • PDFs: Markdownify conversion
  • Audio: Markdownify transcription
  • Tables/Images: Basic extraction
  • Best for: Simple documents, interview recordings

Tier 3: Basic (Fallback)

  • PDFs: Manual copy/paste or OCR
  • Audio: External transcription service
  • Guidance provided for manual workflow

Checking Tier Availability

# Check for MinerU
[ -n "$MINERU_API_KEY" ] && echo "MinerU available (Tier 1)"

# Markdownify is always available (bundled)
echo "Markdownify available (Tier 2)"
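
The same check can be expressed in Node.js, which is the language of the bundled script. This is a minimal sketch, not part of the plugin: `detectTier` is a hypothetical helper, and it assumes only what the tier list states, namely that a non-empty `MINERU_API_KEY` enables MinerU for PDFs and that audio goes through Markdownify in both Tier 1 and Tier 2.

```javascript
// Hypothetical helper (not part of the plugin): pick tools from the environment.
function detectTier(env = process.env) {
  // Treat a missing, empty, or whitespace-only key as "no MinerU".
  const hasMinerU =
    typeof env.MINERU_API_KEY === "string" && env.MINERU_API_KEY.trim() !== "";
  return {
    tier: hasMinerU ? 1 : 2,
    pdf: hasMinerU ? "mineru" : "markdownify",
    audio: "markdownify", // audio uses Markdownify in both tiers
  };
}

console.log(detectTier());
```

Passing the environment as an argument keeps the function easy to exercise without changing real environment variables.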

Workflow by Format

Audio Interviews

Tier 1/2 (Markdownify):

# Transcribe audio file
markdownify audio-to-markdown /path/to/interview.mp3

# Output: interview.md with transcript

Best practices:

  • Use high-quality recordings when possible
  • Review transcripts for accuracy
  • Add speaker labels if not auto-detected
  • Note timestamps for key passages

PDF Documents

Tier 1 (MinerU - recommended for complex PDFs):

// Parse PDF with VLM mode for tables/images
mineru_parse({
  url: "file:///path/to/paper.pdf",
  model: "vlm",
  formula: true,
  table: true
})

Tier 2 (Markdownify):

# Convert PDF to markdown
markdownify pdf-to-markdown /path/to/paper.pdf

Other Formats

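| Format    | Tool                | Notes                  |
|-----------|---------------------|------------------------|
| DOCX      | Markdownify         | Good conversion        |
| PPTX      | Markdownify         | Extracts text + images |
| XLSX      | Markdownify         | Tables preserved       |
| Images    | Markdownify         | OCR + metadata         |
| YouTube   | Markdownify         | Captions/transcript    |
| Web pages | Markdownify or Jina | Full content           |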

Scripts

process-audio.js

Batch process interview recordings.

node skills/interview-ingest/scripts/process-audio.js \
  --project-path /path/to/project \
  --input-dir /path/to/recordings \
  --output-dir stage1-foundation/manual-codes

Output Organization

stage1-foundation/
├── manual-codes/
│   ├── P001-interview.md    # Transcribed interviews
│   ├── P002-interview.md
│   └── ...
├── raw-data/                 # Original files (optional)
│   ├── P001-recording.mp3
│   └── ...
└── data-inventory.json       # Tracks all data sources
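
The naming pattern in this tree (`P001-recording.mp3` becoming `P001-interview.md`) can be planned before any conversion runs. The sketch below is illustrative only and does not describe how `process-audio.js` works internally: `planOutputs`, the extension list, and the `P` plus digits ID pattern are assumptions inferred from the example file names.

```javascript
const path = require("node:path");

// Assumed set of audio extensions; adjust to the formats actually in use.
const AUDIO_EXTENSIONS = new Set([".mp3", ".wav", ".m4a"]);

// Hypothetical helper: map recording file names to transcript paths.
function planOutputs(fileNames) {
  return fileNames
    .filter((name) => AUDIO_EXTENSIONS.has(path.extname(name).toLowerCase()))
    .sort()
    .map((name) => {
      // Expect a participant ID prefix such as "P001-".
      const match = /^(P\d{3,})-/.exec(name);
      if (!match) return { source: name, target: null }; // needs a manual ID
      return { source: name, target: `manual-codes/${match[1]}-interview.md` };
    });
}

console.log(planOutputs(["P002-recording.mp3", "notes.txt", "P001-recording.mp3"]));
```

Files without a recognizable ID get a `null` target so they can be renamed by hand rather than silently assigned an identifier.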

data-inventory.json

{
  "documents": [
    {
      "id": "P001",
      "original_file": "P001-recording.mp3",
      "converted_file": "P001-interview.md",
      "format": "audio",
      "conversion_tool": "markdownify",
      "conversion_date": "2025-01-15",
      "duration_minutes": 45,
      "notes": "Good audio quality"
    }
  ]
}
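
Keeping the inventory current by hand is error-prone, so a small script can add or update entries. This is a sketch under assumptions: `buildEntry` and `upsertEntry` are hypothetical helpers, the field names are copied from the example above, and the demo writes to a temporary directory rather than a real project.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: build an entry using the field names shown above.
function buildEntry({ id, originalFile, convertedFile, format, tool, date, durationMinutes, notes = "" }) {
  const entry = {
    id,
    original_file: originalFile,
    converted_file: convertedFile,
    format,
    conversion_tool: tool,
    conversion_date: date, // YYYY-MM-DD
  };
  if (durationMinutes !== undefined) entry.duration_minutes = durationMinutes;
  entry.notes = notes;
  return entry;
}

// Hypothetical helper: add an entry, or replace the existing one with the same id.
function upsertEntry(inventoryPath, entry) {
  const inventory = fs.existsSync(inventoryPath)
    ? JSON.parse(fs.readFileSync(inventoryPath, "utf8"))
    : { documents: [] };
  if (!Array.isArray(inventory.documents)) inventory.documents = [];
  const index = inventory.documents.findIndex((doc) => doc.id === entry.id);
  if (index === -1) inventory.documents.push(entry);
  else inventory.documents[index] = entry;
  fs.writeFileSync(inventoryPath, JSON.stringify(inventory, null, 2) + "\n");
  return inventory;
}

// Demo against a throwaway directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "inventory-demo-"));
const demoEntry = buildEntry({
  id: "P001",
  originalFile: "P001-recording.mp3",
  convertedFile: "P001-interview.md",
  format: "audio",
  tool: "markdownify",
  date: "2025-01-15",
  durationMinutes: 45,
  notes: "Good audio quality",
});
console.log(upsertEntry(path.join(demoDir, "data-inventory.json"), demoEntry));
```

Replacing by `id` rather than always appending means a re-run of a conversion updates the record instead of duplicating it.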

Quality Considerations

Audio Transcription

  • Review all transcripts - AI transcription has errors
  • Add speaker labels - "Interviewer:" and "Participant:"
  • Note unclear passages - Mark with [unclear] or [inaudible]
  • Include timestamps - For later reference to original

PDF Conversion

  • Check table accuracy - Complex tables may need manual fixes
  • Verify figures - May need manual description
  • Review formatting - Headers, lists, emphasis

Integration with Stages

Stage 1 Preparation

  1. Transcribe/convert all data sources
  2. Organize in stage1-foundation/
  3. Create data-inventory.json
  4. Begin manual coding on converted files

Stage 2 Processing

  1. @dialogical-coder works with markdown files
  2. Quotes reference line numbers in converted files
  3. Audit trail links back to original sources
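
Because quotes are referenced by line number, it helps to be able to look up where a quote sits in a converted file. The sketch below is a simple illustration and not the plugin's own mechanism: `locateQuote` is a hypothetical helper that matches within single lines only, ignoring case and repeated whitespace.

```javascript
// Hypothetical helper: return the 1-based line numbers that contain the quote.
function locateQuote(markdown, quote) {
  const normalize = (s) => s.replace(/\s+/g, " ").trim().toLowerCase();
  const needle = normalize(quote);
  if (needle === "") return [];
  const hits = [];
  markdown.split(/\r?\n/).forEach((line, index) => {
    if (normalize(line).includes(needle)) hits.push(index + 1);
  });
  return hits;
}

const transcript = [
  "Interviewer: How did the team react?",
  "Participant: At first there was  confusion, but then it settled.",
  "Interviewer: And afterwards?",
].join("\n");

console.log(locateQuote(transcript, "there was confusion"));
```

Quotes that span a line break will not be found by this version; editing a converted file after coding also shifts line numbers, so references are best recorded once the transcript review is complete.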

Fallback Guidance

If automated transcription or conversion is unavailable:

Audio Options:

  • Otter.ai - Good transcription service
  • Rev.com - Professional transcription
  • YouTube auto-captions - Upload as unlisted video
  • Manual transcription - Time-intensive but accurate

PDF Options:

  • Adobe Acrobat - Export to Word/text
  • Google Docs - Open PDF, auto-OCR
  • Manual copy/paste - For short documents

Related

  • MCPs: MinerU (optional), Markdownify (bundled)
  • Skills: document-conversion for detailed PDF handling
  • Commands: Data import commands
