Engineering
5 min read

Generating an MCP server in Go

Eric Chiang

CTO / Founder

MCP is table stakes for products these days. Everything’s AI, and agents will query your data one way or another.

As we’ve built Oblique, a core principle has been to avoid internal APIs. Anything a user can do in the UI should be possible through Terraform or REST - and now, MCP. We generate tons of bindings from our gRPC interface with the explicit intent that as new features land, they make their way to every one of our integration points.

Wait, isn’t MCP dead?

There are endless ways to expose capabilities to agents: command line tools, SQL interfaces, transpiling to TypeScript. If there exists some means of querying or modifying data, someone has experimented with it as an MCP alternative.

Regardless of the tech, the model needs to understand the capabilities exposed to it and how to exercise them. For a command line tool, this means keeping your “--help” output accurate. For MCP, it means tool descriptions and input schemas. Fundamentally, if you’re providing primitives to an agent, you’re in the business of keeping the documentation for those interfaces up to date and debugging when things go wrong.

Where MCP is irreplaceable is the growing set of options for autonomous agents: the Claude Code Routines and Codex Automations of the world. Sure, you may choose to implement raw tool calls through a custom harness for your internal workflows, or shell out to a binary on your laptop. But if you want to expose data to a customer workflow, you have to speak MCP. It’s become the lowest common denominator for working with agents.

Tool descriptions and schemas

Descriptions and schemas are critical for tool discovery and for informing the model how to construct tool calls. A majority of the issues we hit boiled down to incorrect plumbing of comments or typos in our API docs. For example, we had a bug where enum fields were commented but their values weren’t.
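
For illustration (a hypothetical enum, not our real schema), the fix was making sure every value carries its own comment, since those comments are all the model ever sees in the generated schema:

// Access level granted to a team member.
enum AccessLevel {
  ACCESS_LEVEL_UNSPECIFIED = 0;
  // Read-only access to team resources.
  VIEWER = 1;
  // Full read and write access to team resources.
  EDITOR = 2;
}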

As we covered in our previous post, we generate as much of our server tooling as possible from Protobuf. In this case, we leveraged Go’s protoreflect package to produce tool definitions for the Go MCP SDK. This means we consume a proto file:

service Oblique {
  // Fetch a user by name, or using the alias "users/me" to return the currently
  // authenticated user.
  rpc GetUser(GetUserRequest) returns (User) {
    option (google.api.http) = {get: "/api/v1/{name=users/*}"};
  }
}
message GetUserRequest {
  // Name of the user of the format "users/{id}". The alias "users/me" is also
  // supported and resolves to the currently authenticated user.
  string name = 1 [
    (google.api.field_behavior) = REQUIRED
  ];
}
// A user represents a human user in the directory.
message User {
  // Format: `users/{id}`
  string name = 1 [(google.api.field_behavior) = IDENTIFIER];
  // Primary display name of the user.
  string display_name = 2 [(google.api.field_behavior) = REQUIRED];
  // Primary email of the user.
  string email = 3 [(google.api.field_behavior) = REQUIRED];
  // ...
}

…and spit out a tool description:

{
	"annotations": {
		"readOnlyHint": true,
		"title": "GetUser"
	},
	"name": "GetUser",
	"description": "Fetch a user by name, or using the alias \"users/me\" to return the currently authenticated user.",
	"inputSchema": {
		"type": "object",
		"properties": {
			"name": {
				"type": "string",
				"description": "Name of the user of the format \"users/{id}\". The alias \"users/me\" is also supported and resolves to the currently authenticated user."
			}
		},
		"required": [
			"name"
		]
	},
	"outputSchema": {
		"type": "object",
		"description": "A user represents a human user in the directory",
		"properties": {
			"name": {
				"type": "string",
				"description": "Format: `users/{id}`"
			},
			"displayName": {
				"type": "string",
				"description": "Primary display name of the user."
			},
			"email": {
				"type": "string",
				"description": "Primary email of the user."
			}
		},
		"required": [
			"name",
			"displayName",
			"email"
		]
	}
}

We’ve consistently found that human, handwritten comments are tremendously powerful at directing models and outperform anything generated, whether in an AGENTS.md file or in tool descriptions. This makes intuitive sense: if an agent can generate a comment, it can likely derive that information on its own anyway.

The challenge here is accessing source code comments, which aren’t available in the files produced by the Go proto plugin. To work around this, our generation scripts output a descriptor file, which we embed in our Go server. That file is then parsed with the protodesc package so we can read comments as we walk the message descriptors:

import (
	_ "embed"

	"google.golang.org/protobuf/proto"
	"google.golang.org/protobuf/reflect/protodesc"
	"google.golang.org/protobuf/reflect/protoregistry"
	"google.golang.org/protobuf/types/descriptorpb"
)

//go:embed descriptor_set.binpb
var descriptor []byte

var descriptorFiles *protoregistry.Files

func init() {
	opts := protodesc.FileOptions{AllowUnresolvable: true}
	set := &descriptorpb.FileDescriptorSet{}
	if err := proto.Unmarshal(descriptor, set); err != nil {
		panic("parsing descriptor: " + err.Error())
	}
	files, err := opts.NewFiles(set)
	if err != nil {
		panic("resolving files: " + err.Error())
	}
	descriptorFiles = files
}

type direction int

const (
	input  direction = 0
	output direction = 1
)

func newTool(md protoreflect.MethodDescriptor) (*mcp.Tool, error) {
	// Look up the method in the embedded descriptor set, which (unlike
	// the generated Go registry) retains source comments.
	desc, err := descriptorFiles.FindDescriptorByName(md.FullName())
	if err != nil {
		return nil, fmt.Errorf("finding descriptor by name: %s: %v", md.FullName(), err)
	}
	description := desc.ParentFile().SourceLocations().ByDescriptor(desc).LeadingComments

	isDestructive := strings.HasPrefix(string(md.Name()), "Delete")
	readonlyPrefixes := []string{"Get", "List", "BatchGet"}
	isReadOnly := false
	for _, p := range readonlyPrefixes {
		if isReadOnly = strings.HasPrefix(string(md.Name()), p); isReadOnly {
			break
		}
	}

	inputSchema, err := schemaForMessage(md.Input(), input)
	if err != nil {
		return nil, fmt.Errorf("generating schema for input: %v", err)
	}
	outputSchema, err := schemaForMessage(md.Output(), output)
	if err != nil {
		return nil, fmt.Errorf("generating schema for output: %v", err)
	}

	toolName := string(md.Name())
	return &mcp.Tool{
		Name:         string(md.Name()),
		Description:  description,
		InputSchema:  inputSchema,
		OutputSchema: outputSchema,
		Annotations: &mcp.ToolAnnotations{
			Title:           toolName,
			DestructiveHint: &isDestructive,
			ReadOnlyHint:    isReadOnly,
		},
	}, nil
}
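
From there, building the tool list is a walk over every service’s methods. A sketch of that wiring, assuming a callRPC handler (not shown) that dispatches tool calls to the underlying gRPC implementation:

func registerTools(server *mcp.Server) error {
	var walkErr error
	descriptorFiles.RangeFiles(func(fd protoreflect.FileDescriptor) bool {
		services := fd.Services()
		for i := 0; i < services.Len(); i++ {
			methods := services.Get(i).Methods()
			for j := 0; j < methods.Len(); j++ {
				tool, err := newTool(methods.Get(j))
				if err != nil {
					walkErr = fmt.Errorf("building tool for %s: %v", methods.Get(j).FullName(), err)
					return false
				}
				// Every RPC becomes a tool; the handler proxies to gRPC.
				server.AddTool(tool, callRPC)
			}
		}
		return true
	})
	return walkErr
}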

Our schema generation also filters fields based on context (the full code was a bit big for this post, but a condensed sketch follows the examples below). If a message is used for tool input, fields with an OUTPUT_ONLY annotation are ignored by the generator. We also remove metadata fields from our API objects that aren’t critical for the model. Consider the following result from our REST API (185 tokens):

{
    "name": "users/ekqzuevxt544jl9i",
    "createTime": "2026-05-12T20:51:28.504018Z",
    "updateTime": "2026-05-12T20:51:28.512136Z",
    "deleteTime": null,
    "displayName": "Eric Chiang",
    "email": "eric@rhombic.dev",
    "secondaryEmails": [
        "eric@obliquesecurity.com",
        "eric@oblique.security"
    ],
    "title": "Software Engineer",
    "manager": "users/sx1qezek79nqyrb0",
    "pictureUri": "https://avatars.githubusercontent.com/u/2342749",
    "directReportCount": 2,
    "totalReportCount": 2
}

On output we use protoreflect to clear common fields (“createTime”, “updateTime”), as well as message-specific fields that aren’t as relevant to an MCP client. In this case our user object consumes half as many tokens (97) as its API equivalent:

{
    "name": "users/ekqzuevxt544jl9i",
    "displayName": "Eric Chiang",
    "email": "eric@rhombic.dev",
    "secondaryEmails": [
        "eric@obliquesecurity.com",
        "eric@oblique.security"
    ],
    "title": "Software Engineer",
    "manager": "users/sx1qezek79nqyrb0"
}
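
Condensed, the two halves of that filtering look roughly like the sketch below. The helper names are ours; the field_behavior extension comes from google.golang.org/genproto/googleapis/api/annotations:

// skipForInput reports whether a field should be omitted from a
// tool's input schema because it's marked OUTPUT_ONLY.
func skipForInput(fd protoreflect.FieldDescriptor) bool {
	opts, ok := fd.Options().(*descriptorpb.FieldOptions)
	if !ok || opts == nil {
		return false
	}
	behaviors := proto.GetExtension(opts, annotations.E_FieldBehavior).([]annotations.FieldBehavior)
	for _, b := range behaviors {
		if b == annotations.FieldBehavior_OUTPUT_ONLY {
			return true
		}
	}
	return false
}

// stripFields clears fields on a response message before it's
// marshaled into a tool result. Names are proto field names
// ("create_time"), not JSON names ("createTime").
func stripFields(m proto.Message, names ...string) {
	refl := m.ProtoReflect()
	fields := refl.Descriptor().Fields()
	for _, name := range names {
		if fd := fields.ByName(protoreflect.Name(name)); fd != nil {
			refl.Clear(fd)
		}
	}
}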

Debugging performance

The most common bug report from our initial testing was “the model is confused by X.”

If you’re at a big company, you have an entire team dedicated to evaluations and prompt engineering. For this particular feature, it was just me debugging where the agent was getting tripped up, and trying to ensure new changes didn’t degrade previous performance.

One of the early features we implemented was a developer mode running our MCP server over stdio. This, paired with simulated data, allowed us to run prompts in a sandbox. Our first iteration was to call Claude Code’s programmatic mode with an MCP configuration and stream the results:

SERVER_BIN="$PWD/bin/oblique-server"
go build -o "$SERVER_BIN" ./cmd/oblique-server
PROMPT="${1:?usage: $0 <prompt>}"
SYSTEM_PROMPT="$PWD/mcp/eval-systemprompt.txt"
TMPDIR="$( mktemp -d )"
MCP_CONFIG="$TMPDIR/mcp_config.json"
cat >"$MCP_CONFIG" <<EOF
{
  "mcpServers": {
    "oblique": {
      "command": "$SERVER_BIN",
      "args": ["--seed-fixture=oblique","--insecure-mcp-stdio"]
    }
  }
}
EOF
cd "$TMPDIR"
claude -p "$PROMPT" \
    --mcp-config "$MCP_CONFIG" \
    --append-system-prompt-file "$SYSTEM_PROMPT" \
    --strict-mcp-config \
    --mcp-debug \
    --permission-mode dontAsk \
    --allowedTools mcp__oblique \
    --verbose \
    --output-format stream-json
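
On the server side, the --insecure-mcp-stdio flag boils down to serving MCP over stdin/stdout instead of our normal HTTP listener. A minimal sketch, assuming the Go MCP SDK's stdio transport and the registerTools helper from earlier:

func runStdio(ctx context.Context) error {
	server := mcp.NewServer(&mcp.Implementation{Name: "oblique", Version: "dev"}, nil)
	if err := registerTools(server); err != nil {
		return err
	}
	// Blocks until the client disconnects or ctx is canceled.
	return server.Run(ctx, &mcp.StdioTransport{})
}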

Claude Code’s streaming JSON format logs a ton of internals about the harness, which helps us understand what’s going on. We can see the results of tool search, to determine whether a tool’s description needs improvement:

{
  "type": "user",
  "tool_use_result": {
    "matches": [
      "mcp__oblique__GetUser",
      "mcp__oblique__SearchUsers",
      "mcp__oblique__SearchTeamMembers",
      "mcp__oblique__SearchTeams"
    ],
    "query": "select:mcp__oblique__GetUser,mcp__oblique__SearchUsers,mcp__oblique__SearchTeamMembers,mcp__oblique__SearchTeams"
  }
}

Or inspect successful and unsuccessful tool calls, which can point to deficiencies in our schema comments:

{
  "type": "assistant",
  "message": {
    "type": "message",
    "role": "assistant",
    "content": [
      {
        "type": "tool_use",
        "name": "mcp__oblique__SearchUsers",
        "input": {
          "filter": "email=\"eric@oblique.security\""
        }
      }
    ]
  }
}

We took that initial bash script and wrote a ~300-line Go program that consumes a library of prompts, calls Claude Code in parallel for each one, and parses the output using Go’s streaming JSON support (sketched after the report below). For every prompt, the program produces a report including the set of tools that were loaded and the tool calls:

- prompt: What teams am I on?
  tool_matches:
    - mcp__oblique__GetUserProfile
    - mcp__oblique__SearchTeams
    - mcp__oblique__SearchTeamMembers
    - mcp__oblique__BatchGetTeams
  tool_calls:
    - name: mcp__oblique__SearchTeamMembers
      input: '{"parent":"teams/-","filter":"member=\"users/me\""}'
    - name: mcp__oblique__BatchGetTeams
      input: '{"names":["teams/engineering","teams/spanish-conversation-group","teams/puzzle-box"]}'
  tool_errors: []
  result: |-
    You're on 3 teams:

    - **Engineering**
    - **Spanish conversation group** — ¡Hola! ¿Cómo estás? ¡Ven a charlar con nosotros los viernes por la mañana! All levels welcome
    - **Puzzle Box** — Make better puzzles

This view was invaluable for seeing whether changes were steering the model in the right direction. Any time we get a bug report, we now add a prompt to our evals that attempts to replicate it, and can very quickly see where the model is getting stuck.
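
The parsing itself is a json.Decoder loop over the event stream. A minimal sketch based on the event shapes above (the toolCall type is ours):

type toolCall struct {
	Name  string
	Input string
}

// collectToolCalls reads Claude Code's stream-json output and records
// every tool_use block the assistant emits.
func collectToolCalls(r io.Reader) ([]toolCall, error) {
	var calls []toolCall
	dec := json.NewDecoder(r)
	for {
		var ev struct {
			Type    string `json:"type"`
			Message struct {
				Content []struct {
					Type  string          `json:"type"`
					Name  string          `json:"name"`
					Input json.RawMessage `json:"input"`
				} `json:"content"`
			} `json:"message"`
		}
		if err := dec.Decode(&ev); err == io.EOF {
			break
		} else if err != nil {
			return nil, fmt.Errorf("decoding event: %v", err)
		}
		if ev.Type != "assistant" {
			continue
		}
		for _, c := range ev.Message.Content {
			if c.Type == "tool_use" {
				calls = append(calls, toolCall{Name: c.Name, Input: string(c.Input)})
			}
		}
	}
	return calls, nil
}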

So what did we learn?

Generating our MCP server this way has produced a great feedback loop for our API documentation: good descriptions help agents just as much as they help humans. We made dozens of tweaks to our API comments as clients hit issues, and those fixes flow into the generated bindings for developers (and coding agents), and into future hosted API docs.

While not everything in our REST API fits cleanly into MCP, keeping the two coupled through conditional generation logic ensures our MCP server gets treated the same as any other surface in the product.

