Add cohesive codebase analysis tools #119
base: develop
# Motivation

The **Codegen on OSS** package provides a pipeline that:

- **Collects repository URLs** from different sources (e.g., CSV files or GitHub searches).
- **Parses repositories** using the codegen tool.
- **Profiles performance** and logs metrics for each parsing run.
- **Logs errors** to help pinpoint parsing failures or performance bottlenecks.

<!-- Why is this change necessary? -->

# Content

<!-- Please include a summary of the change -->

see [codegen-on-oss/README.md](https://github.com/codegen-sh/codegen-sdk/blob/acfe3dc07b65670af33b977fa1e7bc8627fd714e/codegen-on-oss/README.md)

# Testing

<!-- How was the change tested? -->

`uv run modal run modal_run.py`

No unit tests yet 😿

# Please check the following before marking your PR as ready for review

- [ ] I have added tests for my changes
- [x] I have updated the documentation or added new documentation as needed
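To make the four pipeline stages above concrete, here is a minimal sketch in plain Python. It is not the package's actual implementation: `read_repo_urls`, `run_pipeline`, and the `parse_repo` callable are hypothetical names introduced for illustration, and the real pipeline delegates parsing to the codegen tool and runs on Modal.

```python
import csv
import io
import logging
import time
from typing import Callable, Dict, List

logger = logging.getLogger("oss_pipeline_sketch")


def read_repo_urls(csv_text: str) -> List[str]:
    """Collect repository URLs from the first column of CSV text."""
    reader = csv.reader(io.StringIO(csv_text))
    return [row[0].strip() for row in reader if row and row[0].strip()]


def run_pipeline(
    urls: List[str], parse_repo: Callable[[str], None]
) -> List[Dict[str, object]]:
    """Parse each repository, timing the run and logging any failure.

    `parse_repo` is a stand-in (assumption) for whatever actually parses
    a repository; it is expected to raise on failure.
    """
    results: List[Dict[str, object]] = []
    for url in urls:
        start = time.perf_counter()
        try:
            parse_repo(url)
            status = "ok"
        except Exception:
            # Log the traceback so the failing repository can be pinpointed.
            logger.exception("parse failed for %s", url)
            status = "error"
        elapsed = time.perf_counter() - start
        results.append({"url": url, "status": status, "seconds": elapsed})
    return results
```

Catching the exception per repository keeps one bad parse from aborting the whole batch, and recording the elapsed time next to the status gives a single record per run for later profiling.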
Original commit by Tawsif Kamal: Revert "Revert "Adding Schema for Tool Outputs"" (codegen-sh#894)

Reverts codegen-sh#892

---------

Co-authored-by: Rushil Patel <rpatel@codegen.com>
Co-authored-by: rushilpatel0 <171610820+rushilpatel0@users.noreply.github.com>
Original commit by Ellen Agarwal: fix: Workaround for relace not adding newlines (codegen-sh#907)
Reviewer's Guide

This PR consolidates and extends existing codebase analysis functionality into a cohesive toolkit built on the Codegen SDK by adding a comprehensive analyzer module, a context retriever utility, a unified CLI, and accompanying documentation.

Class Diagram for Codebase Context Retriever

classDiagram
direction LR
class CodebaseContext {
-codebase: Codebase
-files: List[SourceFile]
-functions: List[Function]
-classes: List[Class]
-imports: List[Import]
-_function_call_graph: Optional[Dict]
-_import_graph: Optional[Dict]
-_symbol_usage_map: Optional[Dict]
+__init__(codebase: Codebase)
+get_codebase_summary() Dict[str, Any]
+get_unused_functions() List[Function]
+get_unused_imports() List[Import]
+get_functions_with_unused_parameters() List[Tuple[Function, List[str]]]
+get_parameter_mismatches() List[Tuple[Function, List[str]]]
+get_function_call_graph() Dict[str, List[str]]
+get_import_graph() Dict[str, List[str]]
+get_circular_imports() List[Tuple[str, str]]
+get_symbol_usage_map() Dict[str, List[str]]
+get_recursive_functions() List[Function]
+get_complex_functions(threshold: int) List[Tuple[Function, int]]
+get_file_context(file_path: str) Dict[str, Any]
+get_function_context(function_name: str) Dict[str, Any]
+get_class_context(class_name: str) Dict[str, Any]
}
CodebaseContext ..> Codebase : uses
note for Codebase "Provided by Codegen SDK"
note for CodebaseContext "Retrieves and organizes context from a Codebase object provided by the Codegen SDK."
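
The diagram gives only signatures, so the following is a hedged sketch of how two of the listed queries could work on plain dictionaries shaped like the `Dict[str, List[str]]` graphs above. The standalone functions `find_circular_imports` and `find_recursive_functions` are hypothetical and are not taken from `context_retriever.py`. They handle only direct cases (two-file cycles and self-calls), and they return names where the diagram's `get_recursive_functions()` returns `Function` objects.

```python
from typing import Dict, List, Set, Tuple


def find_circular_imports(import_graph: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return sorted pairs (a, b) where a imports b and b imports a."""
    pairs: Set[Tuple[str, str]] = set()
    for src, targets in import_graph.items():
        for dst in targets:
            if src != dst and src in import_graph.get(dst, []):
                # Order each pair so (a, b) and (b, a) collapse to one entry.
                first, second = sorted((src, dst))
                pairs.add((first, second))
    return sorted(pairs)


def find_recursive_functions(call_graph: Dict[str, List[str]]) -> List[str]:
    """Return names of functions that call themselves directly."""
    return sorted(name for name, callees in call_graph.items() if name in callees)


if __name__ == "__main__":
    imports = {"a.py": ["b.py"], "b.py": ["a.py", "c.py"], "c.py": []}
    calls = {"fact": ["fact", "mul"], "mul": []}
    print(find_circular_imports(imports))   # [('a.py', 'b.py')]
    print(find_recursive_functions(calls))  # ['fact']
```

Longer cycles (a imports b, b imports c, c imports a) would need a graph traversal such as depth-first search; this sketch deliberately stops at the pairwise case that matches the `List[Tuple[str, str]]` return type.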
I see a check failed - I'm on it! 🫡

✅ Fixed the failing mypy checks in this commit. The issues were related to missing type annotations and improper null handling in the codebase analysis tools. I've added proper type annotations and null checks throughout the code to ensure type safety.
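
As an illustration of the kind of fix described in the comment above, the sketch below shows an `Optional` cache field with an explicit `None` check, the pattern mypy expects before a possibly-unset attribute is returned. `ImportGraphCache` is a hypothetical class written for this example; it mirrors the `_import_graph: Optional[Dict]` field from the class diagram but is not code from the PR.

```python
from typing import Dict, List, Optional


class ImportGraphCache:
    """Lazily builds and caches an import graph (illustrative only)."""

    def __init__(self, raw_imports: Dict[str, List[str]]) -> None:
        self._raw_imports = raw_imports
        # Annotated as Optional so mypy knows the cache starts out empty.
        self._import_graph: Optional[Dict[str, List[str]]] = None

    def get_import_graph(self) -> Dict[str, List[str]]:
        # The explicit None check narrows the Optional type, so the
        # return value satisfies the non-Optional return annotation.
        if self._import_graph is None:
            self._import_graph = {
                path: sorted(set(deps)) for path, deps in self._raw_imports.items()
            }
        return self._import_graph
```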
I see a check failed - I'm on it! 🫡

✅ Fixed the failing pre-commit checks in this commit. The issues were related to ruff linting rules. The PR should now pass all checks.
This PR adds a cohesive set of codebase analysis tools that consolidate and improve upon the existing functionality in the `error_analyzer.py` and the analyzers directory.

Changes

- Added `codebase_analyzer.py` - A comprehensive analyzer that identifies code quality issues, dependency problems, and structural concerns.
- Added `context_retriever.py` - A utility for retrieving and organizing context from a codebase, focusing on code structure, dependencies, and relationships.
- Added `analyze.py` - A command-line interface for running analyses and retrieving context from codebases.
- Added `README_CODEBASE_ANALYSIS.md` - Documentation for the new codebase analysis tools.

Features

The new tools provide the following capabilities:

Benefits

These tools offer several advantages over the existing implementation:

Usage

See the `README_CODEBASE_ANALYSIS.md` file for detailed usage instructions and examples.
Summary by Sourcery
Introduce a cohesive suite of codebase analysis tools, including a comprehensive analyzer, context retriever, and unified CLI, to assess code quality, dependencies, structure, and provide detailed context with flexible output formats.
New Features:
Enhancements:
Documentation: