# 🎙️ Voice-to-Text App with Whisper

This is a simple, user-friendly application that records your voice and converts it into text using OpenAI's Whisper model. Built with 💻 `Python`, 🎵 `sounddevice`, and 🤗 `Gradio`, this app is designed for local use, requiring internet access **only during the initial setup**.

---

## 🌟 Features

- 🎤 **Record Audio**: Click a button to start and stop recording your voice.
- ✂️ **Automatic Splitting**: Handles long audio files by splitting them into smaller chunks for transcription.
- 📝 **Speech-to-Text**: Transcribes your voice into text using the Whisper model.
- 🔒 **Offline Capability**: After setup, the app works entirely offline.

---

## 🚀 Getting Started

### Prerequisites
1. **Python 3.8+**
2. Install the required Python libraries:
   ```bash
   pip install torch transformers sounddevice pydub gradio
   pip install --upgrade transformers "datasets[audio]" accelerate
   ```
3. **FFmpeg** (for audio processing):
   - Download and install FFmpeg from the [FFmpeg Official Site](https://ffmpeg.org/download.html), or on Windows via [Chocolatey](https://chocolatey.org/).

4. **Possible Bugs:** If PyTorch does not use your GPU, uninstall the current build:
   ```bash
   pip uninstall torch torchvision torchaudio
   ```
   Then find the PyTorch build that matches your GPU and CUDA version and install it. For example, for CUDA 12.4:
   ```bash
   pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
   ```
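
To see whether a reinstall is needed, it can help to ask PyTorch which device it detects. The snippet below is a minimal sketch, not part of `app.py`; the helper name `describe_torch_device` is hypothetical. If it reports `cpu only` on a machine with an NVIDIA GPU, the installed build probably lacks CUDA support.

```python
def describe_torch_device() -> str:
    """Return a one-line summary of the device PyTorch would use."""
    try:
        # Imported lazily so the check still runs when torch is missing.
        import torch
    except ImportError:
        return "torch is not installed"
    if torch.cuda.is_available():
        # get_device_name(0) reports the first visible CUDA device.
        return f"cuda: {torch.cuda.get_device_name(0)} (torch {torch.__version__})"
    return f"cpu only (torch {torch.__version__})"


print(describe_torch_device())
```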
---

### 📦 Installation

1. Clone this repository:
   ```bash
   git clone https://github.com/your-username/voice-to-text-app.git
   cd voice-to-text-app
   ```

2. Run the app:
   ```bash
   python app.py
   ```

3. Open the provided link in your browser to access the web app.

---

## 🛠️ How It Works

1. **Recording**: Click the **Start Recording** button to record your voice. Click **Stop Recording** when you're done.
2. **Transcription**: Click the **Transcribe** button to convert your audio to text.
3. **Automatic Handling**: The app automatically splits audio longer than 30 seconds and transcribes it in chunks.
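
The splitting in step 3 can be pictured as computing fixed-size windows over the recording. The following is a rough sketch under assumptions, not the code in `app.py`: the helper `chunk_bounds` and the constant `CHUNK_MS` are hypothetical names, and only the 30-second threshold comes from this README. It uses milliseconds because pydub's `AudioSegment` is sliced by millisecond offsets.

```python
from typing import List, Tuple

CHUNK_MS = 30_000  # 30 seconds, the threshold mentioned above


def chunk_bounds(duration_ms: int, chunk_ms: int = CHUNK_MS) -> List[Tuple[int, int]]:
    """Return (start, end) millisecond offsets covering the whole recording."""
    if duration_ms < 0 or chunk_ms <= 0:
        raise ValueError("duration_ms must be >= 0 and chunk_ms must be > 0")
    return [
        (start, min(start + chunk_ms, duration_ms))
        for start in range(0, duration_ms, chunk_ms)
    ]


# A 70-second recording yields two full chunks and one 10-second remainder.
print(chunk_bounds(70_000))  # [(0, 30000), (30000, 60000), (60000, 70000)]
```

With pydub, each pair could then be applied as a slice, for example `audio[start:end]` on an `AudioSegment`, and every slice transcribed in turn.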

---

## 🌐 Internet Usage

- **Setup**: An internet connection is needed to install the Python dependencies and, on the first run, to download the Whisper model.
- **Offline Mode**: Once the model is downloaded, the app works entirely offline, so audio is processed locally and stays on your machine. If the model cache is deleted, the model is downloaded again on the next run.
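
One way to check that the model is already cached, and to tell the Hugging Face libraries not to attempt any network access, is sketched below. Several things here are assumptions: the checkpoint name `openai/whisper-small` (this README does not say which Whisper checkpoint the app loads), the helper names, and the default cache location, which may differ on your system. `HF_HUB_OFFLINE` is the environment variable the Hugging Face Hub client reads for offline mode.

```python
import os
from pathlib import Path


def hub_cache_root() -> Path:
    """Best-effort guess at the Hugging Face hub cache directory."""
    if "HF_HUB_CACHE" in os.environ:
        return Path(os.environ["HF_HUB_CACHE"])
    hf_home = os.environ.get("HF_HOME", str(Path.home() / ".cache" / "huggingface"))
    return Path(hf_home) / "hub"


def is_model_cached(repo_id: str, cache_root: Path) -> bool:
    """True if a snapshot folder for repo_id exists under cache_root."""
    folder = cache_root / ("models--" + repo_id.replace("/", "--"))
    return (folder / "snapshots").is_dir()


MODEL_ID = "openai/whisper-small"  # assumption: checkpoint not named in this README
cached = is_model_cached(MODEL_ID, hub_cache_root())
if cached:
    # Has to be set before transformers is imported to take effect.
    os.environ["HF_HUB_OFFLINE"] = "1"
print("model cached:", cached)
```

If the check prints `False`, the model has not been downloaded yet (or lives in a different cache directory), and the first run still needs a connection.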

---

## 🎉 Example Use Case

1. Record a 20-second audio note: "Take out the trash at 6 PM."
2. Stop recording.
3. Transcribe the audio to see: `"Take out the trash at 6 PM."`
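
For readers curious what the transcription step might look like in code, here is a minimal sketch built on the Transformers `pipeline` API. It is not taken from `app.py`: the function name `transcribe`, the `asr` parameter, and the checkpoint `openai/whisper-small` are assumptions for illustration. It also relies on the pipeline's own `chunk_length_s` option for long audio rather than pydub-based splitting, which is a substitution made for brevity.

```python
from typing import Callable, Optional


def transcribe(audio_path: str, asr: Optional[Callable] = None) -> str:
    """Transcribe an audio file and return the stripped text."""
    if asr is None:
        # Lazy import: transformers and torch are heavy and only needed here.
        from transformers import pipeline

        asr = pipeline(
            "automatic-speech-recognition",
            model="openai/whisper-small",  # assumption: checkpoint not named in this README
            chunk_length_s=30,  # let the pipeline process long audio in 30 s windows
        )
    result = asr(audio_path)  # the ASR pipeline returns a dict with a "text" key
    return result["text"].strip()
```

Calling `transcribe("note.wav")` on a saved recording (the file name is hypothetical) would load the model and return the recognized text. Passing a ready-made pipeline as `asr` avoids reloading the model on every call.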

---

## 🧰 Built With

- 🤗 [Transformers](https://huggingface.co/docs/transformers): Whisper model for speech-to-text.
- 🎵 [SoundDevice](https://python-sounddevice.readthedocs.io/): Audio recording.
- ✂️ [Pydub](https://github.com/jiaaro/pydub): Audio splitting for long files.
- 🌐 [Gradio](https://gradio.app/): Interactive web interface.

---

## 🤝 Contributing

Contributions are welcome! Feel free to submit issues or pull requests to improve this project.

---

## 🛡️ License

This project is licensed under the MIT License.

---

## 🗂️ File Structure

```
.
├── app.py             # Main application script
├── requirements.txt   # List of dependencies
└── README.md          # This file
```

---

Have fun 🤗