
We could maybe choose the target window as the screenshot capture source instead of the full screen, to prevent it from being hidden by the Agent:

```
import { desktopCapturer } from 'electron';

// getScreenDimensions and getAiScaledScreenDimensions are assumed to be
// defined elsewhere in the app.
const getScreenshot = async (windowTitle: string) => {
  const { width, height } = getScreenDimensions();
  const aiDimensions = getAiScaledScreenDimensions();

  // Ask for individual windows rather than the full screen
  const sources = await desktopCapturer.getSources({
    types: ['window'],
    thumbnailSize: { width, height },
  });

  // Pick the window whose title matches exactly
  const targetWindow = sources.find(source => source.name === windowTitle);

  if (targetWindow) {
    const screenshot = targetWindow.thumbnail;
    // Resize the screenshot to AI dimensions
    const resizedScreenshot = screenshot.resize(aiDimensions);
    // Convert the resized screenshot to a base64-encoded PNG
    const base64Image = resizedScreenshot.toPNG().toString('base64');
    return base64Image;
  }
  throw new Error(`Window with title "${windowTitle}" not found`);
};
```


Yup, that could help, although if the key content is behind the window, clicks would still bug out. I'm writing a PR to hide the window for now as a simple solution.

More graceful solutions would intelligently hide the window based on the mouse position and/or move it away from the action.
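
As a rough sketch of the "move it away from the action" idea, assuming the Agent window's bounds and the next click position are known in the same screen coordinates, something like this could pick a new spot. The `Rect` and `Pt` types, the `dodgeWindow` name, and the 40 px margin are all made up for illustration and are not from the project.

```typescript
// Hypothetical types and names; nothing here comes from the project itself.
interface Rect { x: number; y: number; width: number; height: number; }
interface Pt { x: number; y: number; }

// True when the point falls inside the rectangle grown by `margin` pixels.
function isNear(rect: Rect, p: Pt, margin: number): boolean {
  return (
    p.x >= rect.x - margin &&
    p.x <= rect.x + rect.width + margin &&
    p.y >= rect.y - margin &&
    p.y <= rect.y + rect.height + margin
  );
}

// Returns where the agent window should sit so it does not cover `target`.
// If the window is already clear of the target, its bounds are returned unchanged;
// otherwise it jumps to the screen corner diagonally opposite the target's quadrant.
// Assumption: the window is small relative to the screen, so the far corner is clear.
function dodgeWindow(agent: Rect, target: Pt, screen: Rect, margin = 40): Rect {
  if (!isNear(agent, target, margin)) return agent;
  const targetInLeftHalf = target.x < screen.x + screen.width / 2;
  const targetInTopHalf = target.y < screen.y + screen.height / 2;
  return {
    ...agent,
    x: targetInLeftHalf ? screen.x + screen.width - agent.width : screen.x,
    y: targetInTopHalf ? screen.y + screen.height - agent.height : screen.y,
  };
}
```

In an Electron app the result could probably be applied with `BrowserWindow.setBounds` right before each click, but that wiring is left out here.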


I think you can use the nut-js desktop automation tool to send commands straight to the target window:

```
import { getWindows, mouse, Point, Window } from '@nut-tree-fork/nut-js';

async function clickLinkInWindow(windowTitle: string, linkCoordinates: { x: number; y: number }) {
  try {
    // Find the first open window whose title matches (windowTitle is used as a regex)
    const titlePattern = new RegExp(windowTitle);
    let targetWindow: Window | undefined;
    for (const candidate of await getWindows()) {
      if (titlePattern.test(await candidate.getTitle())) {
        targetWindow = candidate;
        break;
      }
    }
    if (!targetWindow) {
      throw new Error(`No window found matching title: ${windowTitle}`);
    }

    // Get window position and dimensions
    const windowRegion = await targetWindow.getRegion();
    console.log('Window region:', windowRegion);

    // Focus the window
    await targetWindow.focus();

    // Convert window-relative coordinates to absolute screen coordinates
    const clickPoint = new Point(
      windowRegion.left + linkCoordinates.x,
      windowRegion.top + linkCoordinates.y
    );

    // Move mouse to target and click
    await mouse.setPosition(clickPoint);
    await mouse.leftClick();

    return true;
  } catch (error) {
    console.error('Error clicking link:', error);
    throw error;
  }
}
```
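
One thing to watch if this is combined with the window screenshot idea: the screenshot gets resized to the AI dimensions, so the coordinates the model returns are presumably in that scaled space rather than in window pixels. A minimal sketch of the conversion, assuming the image covers exactly the window region (the `Size`, `Bounds` and `aiToScreenPoint` names are hypothetical, and display scaling is ignored):

```typescript
// Hypothetical helper types; `Bounds` mirrors the left/top/width/height of a window region.
interface Size { width: number; height: number; }
interface Bounds { left: number; top: number; width: number; height: number; }

// Map a point in the AI-scaled screenshot to absolute screen coordinates.
// Assumption: the screenshot shows the whole window and nothing else.
function aiToScreenPoint(
  ai: { x: number; y: number },
  aiSize: Size,
  win: Bounds
): { x: number; y: number } {
  // How many real pixels one screenshot pixel represents on each axis
  const scaleX = win.width / aiSize.width;
  const scaleY = win.height / aiSize.height;
  return {
    x: Math.round(win.left + ai.x * scaleX),
    y: Math.round(win.top + ai.y * scaleY),
  };
}
```

The result could then be passed to `mouse.setPosition` in place of the plain offset addition. If the captured thumbnail keeps the window's aspect ratio but the AI dimensions do not, the mapping may be off and would need checking.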


Maybe instead of a floating window, do it like Zoom does when you're sharing your screen: become a frame around the desktop with a little toolbar at the top. Bonus points if you can give Claude an avatar in a PiP window that talks you through what it's doing.



