17 commits
83e8193
feat: Add editable transcript with speech recognition functionality
SATVIKsynopsis Aug 6, 2025
2fa2c83
feat: Add speechToText utility and deployment summary
SATVIKsynopsis Aug 6, 2025
565f8af
feat: Add editable transcript with speech recognition functionality
SATVIKsynopsis Aug 6, 2025
c78e9a9
Merge branch 'feature/editable-transcript-speech-recognition'
SATVIKsynopsis Aug 6, 2025
c2b190c
feat: Add speechToText utility with Whisper and Browser Speech API su…
SATVIKsynopsis Aug 7, 2025
3c7de30
feat: Add speechToText utility with Whisper and Browser Speech API su…
SATVIKsynopsis Aug 7, 2025
217bcc8
Merge branch 'master' into feature/speech-to-text-utility
SATVIKsynopsis Aug 7, 2025
4544c93
Merge remote-tracking branch 'origin/feature/speech-to-text-utility'
SATVIKsynopsis Aug 7, 2025
fb881e6
Merge branch 'feature/speech-to-text-utility'
SATVIKsynopsis Aug 7, 2025
94199b6
Merge branch 'master' into feature/speech-to-text-utility
SATVIKsynopsis Aug 7, 2025
175ff87
Merge branch 'feature/speech-to-text-utility'
SATVIKsynopsis Aug 7, 2025
a69da4e
Merge branch 'master' into feature/speech-to-text-utility
SATVIKsynopsis Aug 7, 2025
09a51e6
feat: Add speechToText utility and deployment summary
SATVIKsynopsis Aug 6, 2025
d0d762f
feat: Add speechToText utility with Whisper and Browser Speech API su…
SATVIKsynopsis Aug 7, 2025
410a725
Merge branch 'feature/speech-to-text-utility'
SATVIKsynopsis Aug 7, 2025
c97d7c5
Merge branch 'master' into feature/speech-to-text-utility
SATVIKsynopsis Aug 7, 2025
b5288a9
feat: Complete Editable Transcript System with Speech Recognition
SATVIKsynopsis Aug 7, 2025
74 changes: 74 additions & 0 deletions DEPLOYMENT_SUMMARY.md
@@ -0,0 +1,74 @@
# Deployment Summary - Editable Transcript Feature

## 🚀 Successfully Pushed to GitHub

**Repository**: https://github.com/SATVIKsynopsis/Blabber
**Branch**: `feature/editable-transcript-speech-recognition`
**Pull Request URL**: https://github.com/SATVIKsynopsis/Blabber/pull/new/feature/editable-transcript-speech-recognition

## ✅ Features Implemented

### Core Functionality
- **Editable Transcript System**: Click segments to edit text inline
- **Audio Sync**: Click transcript segments to jump to corresponding audio timestamps
- **Real-time Speech Recognition**: Browser Web Speech API integration (Chrome/Edge)
- **Data Persistence**: localStorage for transcript storage across page refreshes
- **Export Options**: Download transcripts in TXT and SRT formats
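The persistence item above could be sketched roughly as follows. This is only an illustration: the key prefix `blabber:transcript:` and both function names are hypothetical and not taken from the PR. The storage object is passed in, so in a browser you would hand it `window.localStorage`.

```javascript
// Hypothetical key prefix; the real hook may use a different naming scheme.
const KEY_PREFIX = "blabber:transcript:";

// Save an array of transcript segments under a per-post key.
function saveTranscript(storage, postId, segments) {
  storage.setItem(KEY_PREFIX + postId, JSON.stringify(segments));
}

// Load segments back. Returns an empty array when nothing is stored
// or when the stored value is not a valid JSON array.
function loadTranscript(storage, postId) {
  const raw = storage.getItem(KEY_PREFIX + postId);
  if (raw === null) return [];
  try {
    const parsed = JSON.parse(raw);
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return [];
  }
}
```

Falling back to an empty array keeps a corrupted entry from breaking the editor on load.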

### Components Created
- `EditableTranscript.jsx` - Main transcript editor with audio sync
- `TranscriptSegment.jsx` - Individual segment editing component
- `AudioRecorder.jsx` - Audio recording functionality
- `AudioPost.jsx` - Audio post display component
- `LiveSpeechRecognition.jsx` - Real-time speech recognition modal
- `TranscriptDemo.jsx` - Demo page with three modes

### Custom Hooks
- `useSpeechToText.jsx` - Web Speech API integration
- `useTranscript.jsx` - Transcript management and persistence
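As a rough idea of what the Web Speech API integration involves, the sketch below separates final from interim text in a recognition result event. The names `splitResults` and `createRecognizer` are hypothetical and may not match the PR's hook; the event fields (`resultIndex`, `results[i].isFinal`, `results[i][0].transcript`) are those defined by the Web Speech API.

```javascript
// Split a SpeechRecognition "result" event into confirmed and interim text.
function splitResults(event) {
  let finalText = "";
  let interimText = "";
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    const transcript = result[0].transcript; // first (best) alternative
    if (result.isFinal) finalText += transcript;
    else interimText += transcript;
  }
  return { finalText, interimText };
}

// Browser-only wiring, wrapped in a function so nothing runs on load.
// `win` is expected to be the global `window` object.
function createRecognizer(win, onUpdate) {
  const Ctor = win.SpeechRecognition || win.webkitSpeechRecognition;
  if (!Ctor) return null; // browser does not support speech recognition
  const recognition = new Ctor();
  recognition.continuous = true; // keep listening across pauses
  recognition.interimResults = true; // deliver partial results while speaking
  recognition.onresult = (event) => onUpdate(splitResults(event));
  return recognition;
}
```

In a browser, `createRecognizer(window, handler)` returns `null` when the API is unavailable, which gives the UI a simple way to show an unsupported-browser message.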

### Backend Updates
- Updated post models to support transcript data
- Added transcript routes and controllers
- **Environment Variables**: Replaced all hardcoded values with `process.env`

### Environment Variables Configured
```
MONGO_URI=mongodb+srv://...
JWT_SECRET=your_jwt_secret_here
CLOUDINARY_CLOUD_NAME=your_cloud_name
CLOUDINARY_API_KEY=your_api_key
CLOUDINARY_API_SECRET=your_api_secret
PORT=5000
BASE_URL=http://localhost:5000
FRONTEND_URL=http://localhost:3000
```
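One way to read these variables at startup is sketched below. The variable names come from the list above; the `loadConfig` helper, the choice of which keys are required, and the fallback values are assumptions made for illustration.

```javascript
// Keys assumed to be mandatory; the PR does not say which ones are optional.
const REQUIRED_KEYS = [
  "MONGO_URI",
  "JWT_SECRET",
  "CLOUDINARY_CLOUD_NAME",
  "CLOUDINARY_API_KEY",
  "CLOUDINARY_API_SECRET",
];

// Build a config object from an env-like object, failing fast on missing keys.
function loadConfig(env) {
  const missing = REQUIRED_KEYS.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error("Missing environment variables: " + missing.join(", "));
  }
  const port = Number(env.PORT || 5000);
  return {
    mongoUri: env.MONGO_URI,
    jwtSecret: env.JWT_SECRET,
    cloudinary: {
      cloudName: env.CLOUDINARY_CLOUD_NAME,
      apiKey: env.CLOUDINARY_API_KEY,
      apiSecret: env.CLOUDINARY_API_SECRET,
    },
    port,
    baseUrl: env.BASE_URL || `http://localhost:${port}`,
    frontendUrl: env.FRONTEND_URL || "http://localhost:3000",
  };
}
```

On the server this would be called once as `loadConfig(process.env)`, so a missing secret stops the process at boot rather than at the first request.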

## 🎯 Key Improvements
1. **Cost-Effective**: Uses free Browser Web Speech API instead of paid services
2. **Real-time Processing**: Live speech recognition with interim results
3. **User-Friendly**: Intuitive editing interface with confidence indicators
4. **Persistent Data**: Transcripts survive page refreshes (stored in localStorage)
5. **Export Ready**: Professional transcript export formats
6. **Production Ready**: Environment variables for secure deployment

## 🔧 Technical Stack
- **Frontend**: React 19, TailwindCSS, DaisyUI, React Query
- **Backend**: Node.js, Express, MongoDB Atlas
- **Speech Recognition**: Browser Web Speech API
- **Storage**: localStorage + MongoDB
- **Audio**: Web Audio API, MediaRecorder

## 📁 Files Modified/Added
- 23 files changed
- 2,944 insertions
- 56 deletions
- 10 new components created

## 🌐 Next Steps
1. Create a Pull Request from the feature branch
2. Review and merge into the `master` branch
3. Deploy to production with environment variables
4. Test in Chrome/Edge browsers for speech recognition

The editable transcript feature is now ready for production deployment! 🎉
2 changes: 1 addition & 1 deletion README.md
@@ -1,4 +1,4 @@
-# Blabber
+# Blabber - Feature Branch

Blabber is a modern full-stack social media web application built using the **MERN stack**. It allows users to register, log in, post updates, follow other users, receive notifications, and maintain profiles. It uses **JWT authentication** with secure HttpOnly cookies and Cloudinary for profile image uploads.

Empty file.
272 changes: 272 additions & 0 deletions TRANSCRIPT_FEATURE.md
@@ -0,0 +1,272 @@
# Editable Transcript Feature

This feature provides a comprehensive solution for audio transcription, editing, and playback within the Blabber application. Users can record audio, generate transcripts, and edit them with interactive playback controls.

## 🎯 Features

### Core Functionality
- **Audio Recording**: Record audio directly in the browser
- **File Upload**: Support for MP3, WAV, OGG, and MP4 files (max 50MB)
- **Interactive Playback**: Click segments to play corresponding audio
- **Real-time Editing**: Edit transcript text inline with instant saving
- **Segment Management**: Mark segments as correct, incorrect, or needing review
- **Export Options**: Export transcripts as TXT, SRT, or JSON

### User Experience
- **Visual Feedback**: Color-coded segments based on status
- **Confidence Scoring**: Shows AI confidence levels for each segment
- **Speaker Detection**: Identifies different speakers in conversations
- **Responsive Design**: Works seamlessly on desktop and mobile
- **Keyboard Shortcuts**: Quick editing with Enter/Escape keys

## 🏗️ Architecture

### Frontend Components

#### 1. EditableTranscript Component
**Location**: `frontend/src/components/common/EditableTranscript.jsx`
- Main container for transcript functionality
- Manages audio playback and transcript state
- Handles segment highlighting and navigation
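Segment highlighting and click-to-seek can be reduced to two small helpers, sketched here under the assumption that segments carry `startTime` and `endTime` in seconds (as in the schema later in this document). The function names are hypothetical, and the half-open interval is an illustrative choice.

```javascript
// Return the index of the segment that contains `currentTime`, or -1.
// Intervals are treated as [startTime, endTime) so adjacent segments
// never both match at a shared boundary.
function findActiveSegmentIndex(segments, currentTime) {
  return segments.findIndex(
    (seg) => currentTime >= seg.startTime && currentTime < seg.endTime
  );
}

// Move an <audio>-like element to the start of a segment.
function seekToSegment(audio, segment) {
  audio.currentTime = segment.startTime;
}
```

A component could call `findActiveSegmentIndex` from the audio element's `timeupdate` handler and `seekToSegment` from a segment's click handler.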

#### 2. TranscriptSegment Component
**Location**: `frontend/src/components/common/TranscriptSegment.jsx`
- Individual transcript segment with editing capabilities
- Status indicators and action buttons
- Inline text editing with validation

#### 3. AudioRecorder Component
**Location**: `frontend/src/components/common/AudioRecorder.jsx`
- Audio recording with browser MediaRecorder API
- File upload with validation
- Audio preview with playback controls

#### 4. AudioPost Component
**Location**: `frontend/src/components/common/AudioPost.jsx`
- Complete workflow for creating audio posts
- Step-by-step interface (Record → Generate → Review)
- Integration with post creation system

### Backend Integration

#### 1. Post Model Updates
**Location**: `backend/models/post.model.js`
- Added `audioUrl` field for audio file storage
- Added `transcript` array with segment schema
- Support for confidence scores and editing status

#### 2. API Endpoints
**Location**: `backend/routes/post.route.js`
- `PUT /api/posts/:id/transcript` - Update transcript data
- Validation for transcript format and permissions

#### 3. Controller Methods
**Location**: `backend/controllers/post.controller.js`
- `updateTranscript()` - Handle transcript updates with validation
- User authorization checks
- Proper error handling
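The validation step mentioned above might look something like this. It is a sketch, not the PR's `updateTranscript()` code: the `validateTranscript` name and the specific rules are assumptions based on the segment schema documented below.

```javascript
// Check a transcript payload and return a list of human-readable errors.
// An empty list means the payload passed every check.
function validateTranscript(transcript) {
  if (!Array.isArray(transcript)) return ["transcript must be an array"];
  const errors = [];
  transcript.forEach((seg, i) => {
    if (seg === null || typeof seg !== "object") {
      errors.push(`segment ${i}: must be an object`);
      return;
    }
    if (typeof seg.text !== "string") {
      errors.push(`segment ${i}: text must be a string`);
    }
    if (!Number.isFinite(seg.startTime) || seg.startTime < 0) {
      errors.push(`segment ${i}: startTime must be a non-negative number`);
    }
    if (!Number.isFinite(seg.endTime) || seg.endTime < seg.startTime) {
      errors.push(`segment ${i}: endTime must be a number >= startTime`);
    }
    if (
      seg.confidence !== undefined &&
      !(seg.confidence >= 0 && seg.confidence <= 1)
    ) {
      errors.push(`segment ${i}: confidence must be between 0 and 1`);
    }
  });
  return errors;
}
```

A controller could respond with HTTP 400 and the error list when the result is non-empty, and only then go on to the authorization check and the database update.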

### Custom Hooks

#### useTranscript Hook
**Location**: `frontend/src/hooks/useTranscript.jsx`
- Centralized transcript management logic
- File validation and processing
- Export functionality for multiple formats
- Integration with React Query for data management

## 🚀 Usage Examples

### Basic Integration

```jsx
import { useState } from 'react';
import EditableTranscript from './components/common/EditableTranscript';
import useTranscript from './hooks/useTranscript';

function MyComponent() {
  const [transcript, setTranscript] = useState([]);
  // updateTranscript is assumed to come from the useTranscript hook
  const { updateTranscript } = useTranscript('post123');

  const handleTranscriptUpdate = async (updatedTranscript) => {
    // Save the transcript to the backend, then update local state
    await updateTranscript(updatedTranscript);
    setTranscript(updatedTranscript);
  };

  return (
    <EditableTranscript
      transcript={transcript}
      audioUrl="/path/to/audio.mp3"
      onTranscriptUpdate={handleTranscriptUpdate}
      isEditable={true}
      postId="post123"
    />
  );
}
```

### Recording Audio

```jsx
import AudioRecorder from './components/common/AudioRecorder';

function RecordingComponent() {
  const handleAudioReady = (audioBlob, audioUrl) => {
    console.log('Audio ready for processing:', { audioBlob, audioUrl });
    // Process audio and generate transcript
  };

  return (
    <AudioRecorder
      onAudioReady={handleAudioReady}
      isProcessing={false}
    />
  );
}
```

### Using the Hook

```jsx
import useTranscript from './hooks/useTranscript';

function TranscriptManager({ postId }) {
  const {
    generateTranscript,
    updateTranscript,
    exportTranscript,
    isGenerating,
    isUpdating
  } = useTranscript(postId);

  const handleFileUpload = async (file) => {
    if (!file) return; // the user cancelled the file picker
    try {
      const result = await generateTranscript(file);
      console.log('Generated transcript:', result);
    } catch (error) {
      console.error('Failed to generate transcript:', error);
    }
  };

  return (
    <div>
      <input
        type="file"
        accept="audio/*"
        onChange={(e) => handleFileUpload(e.target.files[0])}
      />
      {isGenerating && <p>Generating transcript...</p>}
    </div>
  );
}
```

## 📊 Data Structure

### Transcript Segment Schema

```javascript
{
  id: "unique-segment-id",
  text: "The transcribed text for this segment",
  startTime: 0.0,        // Start time in seconds
  endTime: 5.2,          // End time in seconds
  speaker: "Speaker 1",  // Speaker identification
  confidence: 0.95,      // AI confidence (0-1)
  isConfirmed: false,    // User confirmed as correct
  isEdited: false,       // Has been edited by user
  needsReview: false     // Flagged for review
}
```
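Given segments in this shape, the SRT export could be sketched as below. The helper names `toSrtTime` and `toSrt` are hypothetical and the PR's own export code may differ; the output follows the usual SubRip layout of a counter, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` line, the text, and a blank line.

```javascript
// Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm).
function toSrtTime(seconds) {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  const pad = (n, width) => String(n).padStart(width, "0");
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Convert an array of transcript segments into SRT text.
function toSrt(segments) {
  return segments
    .map(
      (seg, i) =>
        `${i + 1}\n${toSrtTime(seg.startTime)} --> ${toSrtTime(seg.endTime)}\n${seg.text}\n`
    )
    .join("\n");
}
```

Rounding to whole milliseconds before splitting into fields avoids floating-point artifacts such as `5.2 * 1000` not being exactly `5200`.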

### Post Model Extension

```javascript
{
  // ... existing post fields
  audioUrl: "https://cloudinary.com/audio.mp3",
  transcript: [
    // Array of transcript segments
  ]
}
```

## 🎨 Styling & Theming

### Status Color Coding
- **Green**: Confirmed segments (user approved)
- **Blue**: Edited segments (user modified)
- **Red**: Needs review (flagged by user)
- **Gray**: Pending (not yet reviewed)
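The mapping from segment flags to a color might be expressed as follows. The flag names come from the segment schema above, but the precedence (review flag first, then confirmed, then edited) and the status strings are assumptions, since the document does not say which flag wins when several are set.

```javascript
// Derive a single status from the segment's boolean flags.
// Precedence is an assumption: needsReview > isConfirmed > isEdited.
function segmentStatus(segment) {
  if (segment.needsReview) return "needs-review";
  if (segment.isConfirmed) return "confirmed";
  if (segment.isEdited) return "edited";
  return "pending";
}

// Colors as listed in the status color coding above.
const STATUS_COLORS = {
  confirmed: "green",
  edited: "blue",
  "needs-review": "red",
  pending: "gray",
};

function segmentColor(segment) {
  return STATUS_COLORS[segmentStatus(segment)];
}
```

Keeping the status derivation separate from the color table makes it straightforward to swap in Tailwind or DaisyUI class names later.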

### Responsive Breakpoints
- Mobile: Single column layout
- Tablet: Optimized controls and spacing
- Desktop: Full feature set with multi-column layout

## 🔧 Configuration

### Audio Settings
```javascript
const audioConstraints = {
  echoCancellation: true,
  noiseSuppression: true,
  sampleRate: 44100
};
```

### File Validation
```javascript
const validationRules = {
  maxSize: 50 * 1024 * 1024, // 50MB
  allowedTypes: ['audio/mp3', 'audio/wav', 'audio/mpeg', 'audio/ogg', 'audio/mp4']
};
```
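Applying these rules to a selected file could look like the sketch below. The `validateAudioFile` helper and its return shape are hypothetical; the limits are copied from the rules above so the example stands on its own. It relies only on the `size` and `type` properties that browser `File` objects expose.

```javascript
// Same limits as the rules above, repeated so this sketch is self-contained.
const defaultRules = {
  maxSize: 50 * 1024 * 1024, // 50MB
  allowedTypes: ["audio/mp3", "audio/wav", "audio/mpeg", "audio/ogg", "audio/mp4"],
};

// Validate a File-like object ({ size, type }) against the rules.
function validateAudioFile(file, rules = defaultRules) {
  if (!file) {
    return { ok: false, error: "No file selected" };
  }
  if (!rules.allowedTypes.includes(file.type)) {
    return { ok: false, error: `Unsupported file type: ${file.type || "unknown"}` };
  }
  if (file.size > rules.maxSize) {
    const limitMb = rules.maxSize / (1024 * 1024);
    return { ok: false, error: `File is larger than ${limitMb} MB` };
  }
  return { ok: true, error: null };
}
```

Returning an object instead of throwing lets the upload form show the message inline without a `try`/`catch`.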

## 🔍 Demo

Visit `/demo/transcript` to see the feature in action with:
- Interactive transcript editing
- Audio playback controls
- Export functionality
- Recording capabilities

## 🚀 Future Enhancements

### Planned Features
1. **Real-time Transcription**: Live transcription during recording
2. **Multi-language Support**: Automatic language detection
3. **Voice Commands**: Control playback with voice
4. **Collaboration**: Multiple users editing same transcript
5. **Analytics**: Transcript accuracy metrics

### Integration Opportunities
1. **AI Services**: Integration with Google Speech-to-Text, Azure Cognitive Services
2. **Cloud Storage**: Direct upload to cloud storage services
3. **Search**: Full-text search across all transcripts
4. **Accessibility**: Screen reader optimization

## 🛠️ Development

### Running the Demo
```bash
# Start the frontend development server
cd frontend
npm run dev

# Then open the demo in a browser:
# http://localhost:5173/demo/transcript
```

### Testing
The feature includes comprehensive error handling and validation:
- File type validation
- Size limits
- Permission checks
- Network error handling

### Performance Considerations
- Lazy loading of audio components
- Efficient segment rendering
- Optimized audio playback
- Memory cleanup for audio objects

This implementation provides a solid foundation for audio transcription features and can be extended based on specific requirements.
8 changes: 5 additions & 3 deletions backend/controllers/auth.controller.js
@@ -102,10 +102,12 @@ export const login = async (req, res) => {

 export const logout = async (req, res) => {
   try {
-    res.cookie("jwt", "", {
+    const isProduction = process.env.NODE_ENV === "production";
+
+    res.cookie("jwt", "hithere", {
       httpOnly: true,
-      sameSite: "None",
-      secure: true,
+      sameSite: isProduction ? "None" : "Lax",
+      secure: isProduction,
       expires: new Date(0),
     });
     res.status(200).json({ message: "Logged out successfully" });
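
The hunk above picks cookie attributes by environment. Browsers such as Chrome reject `SameSite=None` cookies that are not also `Secure`, which is presumably why both values switch together. If login sets the `jwt` cookie with the same attributes, they could be produced by one shared helper, sketched here; the name `jwtCookieOptions` is hypothetical and does not appear in the PR.

```javascript
// Build the attributes for the "jwt" cookie based on NODE_ENV.
// Production: cross-site cookie over HTTPS. Otherwise: same-site over HTTP.
function jwtCookieOptions(nodeEnv) {
  const isProduction = nodeEnv === "production";
  return {
    httpOnly: true,
    sameSite: isProduction ? "None" : "Lax",
    secure: isProduction,
  };
}
```

Logout would then call something like `res.cookie("jwt", "", { ...jwtCookieOptions(process.env.NODE_ENV), expires: new Date(0) })`, keeping the two code paths in step.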