Hacker News
Claude output matching copyrighted StackOverflow code
4 points by randsp 236 days ago
I asked Claude to generate a C++ implementation of a base64 encoder, and the result was nearly identical to a version on StackOverflow. That wouldn’t normally be an issue, but the StackOverflow example appears to be derived from copyrighted code.

Relevant StackOverflow comment: https://stackoverflow.com/questions/180947/base64-decode-snippet-in-c#comment137345692_13935718

Claude version:

std::string base64encode(const char* data, size_t length) {
  static const char* base64_chars =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
      "abcdefghijklmnopqrstuvwxyz"
      "0123456789+/";

  std::string result;
  result.reserve(((length + 2) / 3) * 4);

  size_t i = 0;
  unsigned char char_array_3[3];
  unsigned char char_array_4[4];

  while (length--)
  {
    char_array_3[i++] = *(data++);
    if (i == 3)
    {
      char_array_4[0] = (char_array_3[0] & 0xfc) >> 2;
      char_array_4[1] = ((char_array_3[0] & 0x03) << 4) + ((char_array_3[1] & 0xf0) >> 4);
      char_array_4[2] = ((char_array_3[1] & 0x0f) << 2) + ((char_array_3[2] & 0xc0) >> 6);
      char_array_4[3] = char_array_3[2] & 0x3f;

      for (i = 0; i < 4; i++)
        result += base64_chars[char_array_4[i]];
      i = 0;
    }
  }

  if (i)
  {
    for (size_t j = i; j < 3; j++)
      char_array_3[j] = '\0';

    char_array_4[0] = (char_array_3[0] & 0xfc) >> 2;
    char_array_4[1] = ((char_array_3[0] & 0x03) << 4) + ((char_array_3[1] & 0xf0) >> 4);
    char_array_4[2] = ((char_array_3[1] & 0x0f) << 2) + ((char_array_3[2] & 0xc0) >> 6);

    for (size_t j = 0; j < i + 1; j++)
      result += base64_chars[char_array_4[j]];

    while (i++ < 3)
      result += '=';
  }

  return result;
}
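
For anyone who wants to check how much of the structure above is forced by the problem, here is a minimal sketch of a differently structured encoder (a bit accumulator instead of the 3-byte and 4-byte scratch arrays), plus a check against the RFC 4648 section 10 test vectors. The names base64encode_accum and matches_rfc4648_vectors are made up for this sketch; they do not come from the Claude output or the StackOverflow answer, and this is not presented as how either was written.

```cpp
#include <cassert>   // for the assert shown in the usage note
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical stand-in encoder, written for this sketch only.
// It shifts input bytes into an accumulator and emits a character
// whenever at least 6 bits are pending.
std::string base64encode_accum(const char* data, std::size_t length) {
  static const char kAlphabet[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
      "abcdefghijklmnopqrstuvwxyz"
      "0123456789+/";

  std::string result;
  result.reserve(((length + 2) / 3) * 4);

  std::uint32_t acc = 0;  // pending input bits live in the low end
  int bits = 0;           // how many bits in acc are still pending

  for (std::size_t i = 0; i < length; ++i) {
    acc = (acc << 8) | static_cast<unsigned char>(data[i]);
    bits += 8;
    while (bits >= 6) {
      bits -= 6;
      result += kAlphabet[(acc >> bits) & 0x3f];
    }
  }

  // Flush the 2 or 4 leftover bits, left-aligned in a 6-bit group.
  if (bits > 0)
    result += kAlphabet[(acc << (6 - bits)) & 0x3f];

  // Pad the output to a multiple of 4 characters.
  while (result.size() % 4 != 0)
    result += '=';

  return result;
}

// Returns true if `encode` reproduces the RFC 4648 section 10 test vectors.
bool matches_rfc4648_vectors(std::string (*encode)(const char*, std::size_t)) {
  struct Vector {
    const char* input;
    const char* expected;
  };
  static const Vector kVectors[] = {
      {"", ""},
      {"f", "Zg=="},
      {"fo", "Zm8="},
      {"foo", "Zm9v"},
      {"foob", "Zm9vYg=="},
      {"fooba", "Zm9vYmE="},
      {"foobar", "Zm9vYmFy"},
  };
  for (const Vector& v : kVectors) {
    if (encode(v.input, std::strlen(v.input)) != v.expected)
      return false;
  }
  return true;
}
```

The checker takes any function with the same signature, so assert(matches_rfc4648_vectors(base64encode)); should work for the snippet above as well, assuming it is compiled in the same file with <string> included.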
2 comments

Would you mind sharing the full conversation? I'm asking because there's not that much logic in this snippet, and I think that if your prompting was sufficiently specific, you could almost constrain it to give you something with this structure.
I can’t share the full conversation, sorry. But I just took a function with the same purpose that used OpenSSL and asked Claude to rewrite it so it would work without relying on OpenSSL.
And how similar was this other function?
In 2025 IP and copyright don't play a role anymore ;)