3068 decouple log configuration from apifylog #3399

Draft
l2ysho wants to merge 2 commits into v4 from 3068-decouple-log-configuration-from-apifylog

@l2ysho l2ysho commented Feb 9, 2026

The idea is to create and export a CrawleeLogger interface that users can implement.

  • Users currently have to implement every method of the interface; I am not sure whether there is an easy way around this.
  • I ran some basic tests 🟢

Example usage (using winston):

index.ts

import { CheerioCrawler, Configuration } from "crawlee";
import type { CrawleeLogger, CrawleeLoggerOptions } from "crawlee";
import winston from "winston";

// Create Winston logger
const winstonLogger = winston.createLogger({
  level: "debug",
  format: winston.format.combine(
    winston.format.colorize(),
    winston.format.timestamp(),
    winston.format.printf(({ level, message, timestamp, ...meta }) => {
      const prefix = meta.prefix ? `[${meta.prefix}] ` : "";
      const dataStr =
        Object.keys(meta).length > 1 ? ` ${JSON.stringify(meta)}` : "";
      return `${timestamp} [WINSTON_CUSTOM_LOGGER] ${level}: ${prefix}${message}${dataStr}`;
    }),
  ),
  transports: [
    new winston.transports.Console(),
    new winston.transports.File({ filename: "log/crawlee.log" }),
  ],
});

// Adapter to make Winston compatible with CrawleeLogger interface
class WinstonCrawleeAdapter implements CrawleeLogger {
  private logger: winston.Logger;
  private options: CrawleeLoggerOptions = {};
  private level: number = 4; // INFO
  private warningsLogged = new Set<string>();

  constructor(logger: winston.Logger, options: CrawleeLoggerOptions = {}) {
    this.logger = logger;
    this.options = options;
  }

  getLevel(): number {
    return this.level;
  }

  setLevel(level: number): void {
    this.level = level;
    // "silent" is not one of Winston's default levels, so OFF (0) uses the `silent` flag
    this.logger.silent = level === 0;
    // Map the remaining Crawlee levels to Winston levels
    const levelMap: Record<number, string> = {
      1: "error", // ERROR
      2: "warn", // SOFT_FAIL
      3: "warn", // WARNING
      4: "info", // INFO
      5: "debug", // DEBUG
      6: "debug", // PERF
    };
    this.logger.level = levelMap[level] ?? "info";
  }

  getOptions(): CrawleeLoggerOptions {
    return this.options;
  }

  setOptions(options: Partial<CrawleeLoggerOptions>): void {
    this.options = { ...this.options, ...options };
  }

  child(options: Partial<CrawleeLoggerOptions>): CrawleeLogger {
    const childLogger = this.logger.child({ prefix: options.prefix });
    return new WinstonCrawleeAdapter(childLogger, {
      ...this.options,
      ...options,
    });
  }

  error(message: string, data?: Record<string, any> | null): void {
    this.logger.error(message, { ...data, prefix: this.options.prefix });
  }

  exception(
    exception: Error,
    message: string,
    data?: Record<string, any> | null,
  ): void {
    this.logger.error(`${message}: ${exception.message}`, {
      ...data,
      prefix: this.options.prefix,
      stack: exception.stack,
    });
  }

  softFail(message: string, data?: Record<string, any> | null): void {
    this.logger.warn(message, { ...data, prefix: this.options.prefix });
  }

  warning(message: string, data?: Record<string, any> | null): void {
    this.logger.warn(message, { ...data, prefix: this.options.prefix });
  }

  warningOnce(message: string): void {
    if (!this.warningsLogged.has(message)) {
      this.warningsLogged.add(message);
      this.logger.warn(message, { prefix: this.options.prefix });
    }
  }

  info(message: string, data?: Record<string, any> | null): void {
    this.logger.info(message, { ...data, prefix: this.options.prefix });
  }

  debug(message: string, data?: Record<string, any> | null): void {
    this.logger.debug(message, { ...data, prefix: this.options.prefix });
  }

  perf(message: string, data?: Record<string, any> | null): void {
    this.logger.debug(`[PERF] ${message}`, {
      ...data,
      prefix: this.options.prefix,
    });
  }

  deprecated(message: string): void {
    this.warningOnce(`[DEPRECATED] ${message}`);
  }

  internal(level: number, message: string, data?: any, exception?: any): void {
    const levelMap: Record<number, keyof winston.Logger> = {
      1: "error",
      2: "warn",
      3: "warn",
      4: "info",
      5: "debug",
      6: "debug",
    };
    const winstonLevel = levelMap[level] || "info";
    (this.logger as any)[winstonLevel](message, {
      ...data,
      prefix: this.options.prefix,
      exception,
    });
  }
}

// Create the adapter
const crawleeLogger = new WinstonCrawleeAdapter(winstonLogger);

// Configure Crawlee to use Winston
const config = new Configuration({
  loggerProvider: crawleeLogger,
  logLevel: 5, // DEBUG
});

// Create crawler with custom configuration
const crawler = new CheerioCrawler(
  {
    async requestHandler({ request, $, log }) {
      log.info(`Processing ${request.url}`);
      const title = $("title").text();
      log.debug("Page title extracted", { title });
      console.log(`Title: ${title}`);
    },
  },
  config,
);

// Run the crawler
(async () => {
  console.log("Starting crawler with Winston logger...\n");
  await crawler.run(["https://crawlee.dev"]);
  console.log("\nCrawler finished!");
})();

@l2ysho l2ysho self-assigned this Feb 9, 2026
Comment on lines +56 to +101
error(message: string, data?: Record<string, any> | null): void;

/**
* Logs an `ERROR` level message with a nicely formatted exception.
*/
exception(exception: Error, message: string, data?: Record<string, any> | null): void;

/**
* Logs a `SOFT_FAIL` level message.
*/
softFail(message: string, data?: Record<string, any> | null): void;

/**
* Logs a `WARNING` level message.
*/
warning(message: string, data?: Record<string, any> | null): void;

/**
* Logs a `WARNING` level message only once.
*/
warningOnce(message: string): void;

/**
* Logs an `INFO` message.
*/
info(message: string, data?: Record<string, any> | null): void;

/**
* Logs a `DEBUG` message.
*/
debug(message: string, data?: Record<string, any> | null): void;

/**
* Logs a `PERF` level message for performance tracking.
*/
perf(message: string, data?: Record<string, any> | null): void;

/**
* Logs given message only once as WARNING for deprecated features.
*/
deprecated(message: string): void;

/**
* Internal logging method used by some Crawlee internals.
*/
internal(level: number, message: string, data?: any, exception?: any): void;

It might be more practical to require a single log(message, level, once) method in the interface and provide these helper methods through some wrapper.

Feel free to tweak things to a reasonable degree; we can afford BC breaks. We don't really need to keep @apify/log directly assignable to CrawlingContext.log. It's no problem if that requires some additional wrapper or even adding a new method to @apify/log.
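
A minimal sketch of how the single-method idea above might look, offered as one possible reading rather than the intended design. All names here (LogLevel, MinimalLogger, LoggerFacade) are hypothetical and not part of Crawlee or @apify/log. The level numbers follow the comments in the example in the PR description. As an assumption, the "once" bookkeeping lives in the wrapper instead of being passed to log, and log takes an optional data argument in its place.

```typescript
// Hypothetical names: LogLevel, MinimalLogger and LoggerFacade are
// illustrative only and are not part of Crawlee or @apify/log.
const LogLevel = {
  OFF: 0,
  ERROR: 1,
  SOFT_FAIL: 2,
  WARNING: 3,
  INFO: 4,
  DEBUG: 5,
  PERF: 6,
} as const;
type LogLevel = (typeof LogLevel)[keyof typeof LogLevel];

type LogData = Record<string, unknown> | null;

// The single method a user-supplied logger would need to implement.
interface MinimalLogger {
  log(message: string, level: LogLevel, data?: LogData): void;
}

// Wrapper that derives the convenience methods from that single method.
class LoggerFacade {
  private readonly target: MinimalLogger;
  private level: LogLevel;
  private readonly seen = new Set<string>();

  constructor(target: MinimalLogger, level: LogLevel = LogLevel.INFO) {
    this.target = target;
    this.level = level;
  }

  setLevel(level: LogLevel): void {
    this.level = level;
  }

  // Drops messages above the current level and repeated "once" messages.
  private emit(level: LogLevel, message: string, data?: LogData, once = false): void {
    if (level > this.level) return;
    if (once) {
      if (this.seen.has(message)) return;
      this.seen.add(message);
    }
    this.target.log(message, level, data);
  }

  error(message: string, data?: LogData): void {
    this.emit(LogLevel.ERROR, message, data);
  }

  exception(error: Error, message: string, data?: LogData): void {
    this.emit(LogLevel.ERROR, `${message}: ${error.message}`, { ...data, stack: error.stack });
  }

  softFail(message: string, data?: LogData): void {
    this.emit(LogLevel.SOFT_FAIL, message, data);
  }

  warning(message: string, data?: LogData): void {
    this.emit(LogLevel.WARNING, message, data);
  }

  warningOnce(message: string): void {
    this.emit(LogLevel.WARNING, message, undefined, true);
  }

  info(message: string, data?: LogData): void {
    this.emit(LogLevel.INFO, message, data);
  }

  debug(message: string, data?: LogData): void {
    this.emit(LogLevel.DEBUG, message, data);
  }

  perf(message: string, data?: LogData): void {
    this.emit(LogLevel.PERF, message, data);
  }

  deprecated(message: string): void {
    this.emit(LogLevel.WARNING, `[DEPRECATED] ${message}`, undefined, true);
  }
}

// Usage: an in-memory sink standing in for Winston, pino, or similar.
const lines: string[] = [];
const log = new LoggerFacade({
  log(message, level) {
    lines.push(`${level}: ${message}`);
  },
});

log.info("Processing page");
log.debug("Hidden at the default INFO level");
log.warningOnce("Retrying");
log.warningOnce("Retrying"); // suppressed: already logged once
log.deprecated("oldOption");
console.log(lines);
```

With a split like this, an adapter for a third-party logger would shrink to a single log method, while level filtering and once-only tracking stay in one place.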
