General groupby optimizations #9

@tedmiddleton

Description

  • Would sorting the row indices make grouped_frame faster or slower?

  • Strings are still quite slow in mainframe. I need to investigate whether series_vector has more copies that could be optimized away, and whether string hashing should be implemented in series or frame.

  • Thinking about this further, mainframe could offer a transformer a bit like frame::allow_missing() that instead converts a std::string column into some other type that uses hashes. That would more or less mean writing a hashed string class. Maybe even ref-counted? Something that coerces easily to and from std::string?

  • Could series_vector be more efficient in its copies/moves when dealing with primitives and PODs?

  • Even unordered_map seems a bit slow for what I'm using it for. I wonder whether that could be improved?
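
To make the hashed string idea above concrete, here is a minimal sketch of what such a class might look like. Everything in it is an assumption for illustration: `hashed_string` and `group_rows` are hypothetical names, not existing mainframe APIs, and the design (a `std::shared_ptr` to an immutable `std::string` plus a hash cached at construction) is only one of several possible approaches. Whether it actually beats plain `std::string` keys would need measuring.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical class, not part of mainframe: an immutable, ref-counted
// string whose hash is computed once, at construction.
class hashed_string
{
public:
    hashed_string() : hashed_string( std::string{} ) {}

    // Implicit on purpose, so that std::string coerces to hashed_string.
    hashed_string( std::string s )
        : m_data( std::make_shared<const std::string>( std::move( s ) ) )
        , m_hash( std::hash<std::string>{}( *m_data ) )
    {}

    hashed_string( const char* s ) : hashed_string( std::string( s ) ) {}

    // Access the characters without copying them.
    const std::string& str() const
    {
        assert( m_data != nullptr ); // only null in a moved-from object
        return *m_data;
    }

    // Coerces back to std::string.
    operator const std::string&() const { return str(); }

    std::size_t hash() const { return m_hash; }

    friend bool operator==( const hashed_string& a, const hashed_string& b )
    {
        if ( a.m_data == b.m_data ) return true;  // same shared buffer
        if ( a.m_hash != b.m_hash ) return false; // hashes differ, so do strings
        return a.str() == b.str();                // equal hashes: compare fully
    }

    friend bool operator!=( const hashed_string& a, const hashed_string& b )
    {
        return !( a == b );
    }

private:
    std::shared_ptr<const std::string> m_data; // copies only bump a ref count
    std::size_t m_hash;                        // cached hash of *m_data
};

namespace std
{
// Lets hashed_string be used as an unordered_map key with the cached hash.
template<>
struct hash<hashed_string>
{
    size_t operator()( const hashed_string& s ) const noexcept
    {
        return s.hash();
    }
};
} // namespace std

// Hypothetical helper: collect the row indices that share each key.
std::unordered_map<hashed_string, std::vector<std::size_t>>
group_rows( const std::vector<hashed_string>& keys )
{
    std::unordered_map<hashed_string, std::vector<std::size_t>> groups;
    groups.reserve( keys.size() ); // avoids rehashing while inserting
    for ( std::size_t i = 0; i < keys.size(); ++i ) {
        groups[ keys[ i ] ].push_back( i );
    }
    return groups;
}
```

With this sketch, copying a key into the map copies a pointer and a `size_t` rather than the characters, and lookups reuse the cached hash instead of rehashing the string. The cost is an extra allocation per distinct string and atomic ref-count traffic on copies, so it would only pay off if the same strings are hashed or copied many times.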
