feat: Add failed_files to the output of pdf converters

**Is your feature request related to a problem? Please describe.**
It would be helpful to have access to the list of files that failed to convert, so they can be sent to another converter, such as an OCR-based one. Ideally, this feature would work for both `PyPDFToDocument` and `PDFMinerToDocument`.

**Describe the solution you'd like**
When converting a file raises an exception, the failing source would be appended to a list that is returned alongside `documents`, something like this:
```python
failed_files = []  # proposed: sources that could not be converted

for source in sources:
    ...  # existing code that loads `bytestream` from `source`
    try:
        pdf_reader = PdfReader(io.BytesIO(bytestream.data))
        text = self._default_convert(pdf_reader)
    except Exception as e:
        logger.warning(
            "Could not read {source} and convert it to Document, skipping. {error}", source=source, error=e
        )
        failed_files.append(source)  # return this list along with `documents`
        continue
```
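
To illustrate how the extra output could be used downstream, here is a minimal, library-free sketch of the pattern. It does not use the Haystack API: `convert_all`, `primary`, and `ocr_fallback` are hypothetical names made up for this example, and the `failed_files` output key is the one proposed in this issue, not an existing one.

```python
from typing import Callable, Dict, List


def convert_all(sources: List[str], convert: Callable[[str], str]) -> Dict[str, List[str]]:
    """Convert every source, collecting the ones that raise instead of dropping them."""
    documents: List[str] = []
    failed_files: List[str] = []
    for source in sources:
        try:
            documents.append(convert(source))
        except Exception:
            # Remember the failing source so the caller can retry it elsewhere.
            failed_files.append(source)
            continue
    return {"documents": documents, "failed_files": failed_files}


def primary(source: str) -> str:
    # Hypothetical stand-in for a text-layer PDF converter.
    if source.endswith("scanned.pdf"):
        raise ValueError("no extractable text layer")
    return f"text of {source}"


def ocr_fallback(source: str) -> str:
    # Hypothetical stand-in for an OCR-based converter.
    return f"OCR text of {source}"


first = convert_all(["a.pdf", "scanned.pdf"], primary)
# Only the files that failed in the first pass go to the fallback converter.
second = convert_all(first["failed_files"], ocr_fallback)
all_documents = first["documents"] + second["documents"]
```

With an output like this, the failed sources can be routed to a second converter without parsing log messages.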

feat: Add failed_files to the output of pdf converters #9851
