Automated Cloud Migrations with Kiro and the Arm MCP Server
2025-12-20
admin
AWS Graviton offers the best price performance of any EC2 instance on AWS, and with the recent announcement of Graviton5, the value proposition keeps getting better. Migrating to Graviton makes financial sense and can give your applications a significant performance boost. Most of the time your applications will just move over seamlessly, but what if you've got some x86-specific optimizations in your code?

Good news: you don't have to migrate manually anymore. In this post, I'll show you how to use Kiro (AWS's agentic IDE) combined with the Arm MCP Server to automate the entire migration process: Docker images, SIMD intrinsics, compiler flags, the whole thing.

## What's the Arm MCP Server?

The Arm MCP Server implements the Model Context Protocol, a standard that lets AI coding assistants tap into specialized tools. When you connect it to Kiro, you suddenly have an agent that can:

- Check Docker images for arm64 support without you having to dig through manifests
- Scan your codebase for x86-specific code (intrinsics, build flags, etc.)
- Search Arm's knowledge base for migration guidance and intrinsic equivalents
- Analyze assembly for performance characteristics

## The Problem: A Legacy x86 Application

Let's walk through an example of how you might use the Arm MCP Server with Kiro. Pretend you've inherited a legacy benchmarking application that's deeply tied to x86. The Dockerfile, the matrix multiplication code, the header file `matrix_operations.h`, and `main.cpp` all appear in the listings below. The Dockerfile will cause problems when you try to migrate to Graviton:

- `centos:6` doesn't support arm64
- `-mavx2` is an x86-only compiler flag
- The code uses AVX2 intrinsics (spoiler: they won't compile on Arm)

If you didn't spot these issues yourself, that's fine! Kiro with the Arm MCP Server will. The source is full of x86-specific intrinsic code, but you don't have to convert it manually: Kiro plus the Arm MCP Server can do it for you.

## Setting Up Kiro with the Arm MCP Server

First, you need to configure Kiro to connect to the Arm MCP Server. Kiro uses JSON configuration files for MCP servers. Create `.kiro/settings/mcp.json` in your project root (the full file is in the listings below). A few things to note:

- The `command` runs the Arm MCP Server via Docker
- `-v "/path/to/your/code:/workspace"` mounts your project directory so the scanner can access your code; replace `/path/to/your/code` with your actual path

Save the file and Kiro will automatically pick up the new server. You can verify it's connected by typing `/mcp` in the chat; you should see `arm-mcp` listed with its tools. Once that's done, you can do quick checks right in the chat, for example by asking Kiro to check the base image in the Dockerfile for Arm compatibility. Kiro will use the Arm MCP tools and tell you that `centos:6` only supports amd64. Cool, but we want to automate the whole thing.

## Full Automation With Kiro Steering Documents

To fully automate migrations, you can use Kiro "steering documents": markdown files that give the AI persistent context and instructions. Instead of explaining what you want every time, you write it once and reference it. Create a file at `.kiro/steering/arm-migration.md` (shown in full below). The `inclusion: manual` setting means the steering document only kicks in when you reference it; use `inclusion: always` if you want it active all the time.

## Running the Migration

Now for the fun part. In Kiro's chat, just type `#arm-migration`. That's it. Kiro will:

- Find your Dockerfile and check `centos:6` for arm64 support (it'll fail)
- Suggest base image replacements
- Scan your C++ code with `migrate_ease_scan`
- Find all those AVX2 intrinsics and convert them to NEON
- Update the compiler flags
- Give you a summary of everything it changed

## Verify It Works

After accepting the changes, build and test on an Arm system using the commands in the last listing; you should see output like the sample at the end of the post. If something breaks, just paste the error back into Kiro and it'll fix it. That's the beauty of agentic workflows: you're not debugging alone.

Happy migrating! If you run into problems or have questions, you can always email [email protected] for help.

All of the files and commands referenced above follow.
The legacy Dockerfile:

```dockerfile
FROM centos:6

# CentOS 6 reached EOL, need to use vault mirrors
RUN sed -i 's|^mirrorlist=|#mirrorlist=|g' /etc/yum.repos.d/CentOS-Base.repo && \
    sed -i 's|^#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-Base.repo

# Install EPEL repository (required for some development tools)
RUN yum install -y epel-release && \
    sed -i 's|^mirrorlist=|#mirrorlist=|g' /etc/yum.repos.d/epel.repo && \
    sed -i 's|^#baseurl=http://download.fedoraproject.org/pub/epel|baseurl=http://archives.fedoraproject.org/pub/archive/epel|g' /etc/yum.repos.d/epel.repo

# Install Developer Toolset 2 for better C++11 support (GCC 4.8)
RUN yum install -y centos-release-scl && \
    sed -i 's|^mirrorlist=|#mirrorlist=|g' /etc/yum.repos.d/CentOS-SCLo-scl.repo && \
    sed -i 's|^mirrorlist=|#mirrorlist=|g' /etc/yum.repos.d/CentOS-SCLo-scl-rh.repo && \
    sed -i 's|^# baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-SCLo-scl.repo && \
    sed -i 's|^# baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-SCLo-scl-rh.repo

# Install build tools
RUN yum install -y \
    devtoolset-2-gcc \
    devtoolset-2-gcc-c++ \
    devtoolset-2-binutils \
    make \
    && yum clean all

WORKDIR /app
COPY *.h *.cpp ./

# AVX2 intrinsics are used in the code
RUN scl enable devtoolset-2 "g++ -O2 -mavx2 -o benchmark \
    main.cpp \
    matrix_operations.cpp \
    -std=c++11"

CMD ["./benchmark"]
```
matrix_operations.cpp:

```cpp
#include "matrix_operations.h"
#include <iostream>
#include <random>
#include <chrono>
#include <stdexcept>
#include <immintrin.h> // AVX2 intrinsics

Matrix::Matrix(size_t r, size_t c) : rows(r), cols(c) {
    data.resize(rows, std::vector<double>(cols, 0.0));
}

void Matrix::randomize() {
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_real_distribution<> dis(0.0, 10.0);
    for (size_t i = 0; i < rows; i++) {
        for (size_t j = 0; j < cols; j++) {
            data[i][j] = dis(gen);
        }
    }
}

Matrix Matrix::multiply(const Matrix& other) const {
    if (cols != other.rows) {
        throw std::runtime_error("Invalid matrix dimensions for multiplication");
    }
    Matrix result(rows, other.cols);

    // x86-64 optimized using AVX2 for double-precision
    for (size_t i = 0; i < rows; i++) {
        for (size_t j = 0; j < other.cols; j++) {
            __m256d sum_vec = _mm256_setzero_pd();
            size_t k = 0;
            // Process 4 elements at a time with AVX2
            for (; k + 3 < cols; k += 4) {
                __m256d a_vec = _mm256_loadu_pd(&data[i][k]);
                __m256d b_vec = _mm256_set_pd(
                    other.data[k+3][j],
                    other.data[k+2][j],
                    other.data[k+1][j],
                    other.data[k][j]
                );
                sum_vec = _mm256_add_pd(sum_vec, _mm256_mul_pd(a_vec, b_vec));
            }

            // Horizontal add using AVX
            __m128d sum_high = _mm256_extractf128_pd(sum_vec, 1);
            __m128d sum_low = _mm256_castpd256_pd128(sum_vec);
            __m128d sum_128 = _mm_add_pd(sum_low, sum_high);
            double sum_arr[2];
            _mm_storeu_pd(sum_arr, sum_128);
            double sum = sum_arr[0] + sum_arr[1];

            // Handle remaining elements
            for (; k < cols; k++) {
                sum += data[i][k] * other.data[k][j];
            }
            result.data[i][j] = sum;
        }
    }
    return result;
}

double Matrix::sum() const {
    double total = 0.0;
    for (size_t i = 0; i < rows; i++) {
        for (size_t j = 0; j < cols; j++) {
            total += data[i][j];
        }
    }
    return total;
}

void benchmark_matrix_ops() {
    std::cout << "\n=== Matrix Multiplication Benchmark ===" << std::endl;
    const size_t size = 200;
    Matrix a(size, size);
    Matrix b(size, size);
    a.randomize();
    b.randomize();

    auto start = std::chrono::high_resolution_clock::now();
    Matrix c = a.multiply(b);
    auto end = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);

    std::cout << "Matrix size: " << size << "x" << size << std::endl;
    std::cout << "Time: " << duration.count() << " ms" << std::endl;
    std::cout << "Result sum: " << c.sum() << std::endl;
}
```
The header file, matrix_operations.h:

```cpp
#ifndef MATRIX_OPERATIONS_H
#define MATRIX_OPERATIONS_H

#include <vector>
#include <cstddef>

class Matrix {
private:
    std::vector<std::vector<double>> data;
    size_t rows;
    size_t cols;

public:
    Matrix(size_t r, size_t c);
    void randomize();
    Matrix multiply(const Matrix& other) const;
    double sum() const;
    size_t getRows() const { return rows; }
    size_t getCols() const { return cols; }
};

void benchmark_matrix_ops();

#endif // MATRIX_OPERATIONS_H
```
main.cpp:

```cpp
#include "matrix_operations.h"
#include <iostream>

int main() {
    std::cout << "x86-64 AVX2 Matrix Operations Benchmark" << std::endl;
    std::cout << "========================================" << std::endl;

#if defined(__x86_64__) || defined(_M_X64)
    std::cout << "Running on x86-64 architecture with AVX2 optimizations" << std::endl;
#else
#error "This code requires x86-64 architecture with AVX2 support"
#endif

    benchmark_matrix_ops();
    return 0;
}
```
.kiro/settings/mcp.json:

```json
{
  "mcpServers": {
    "arm-mcp": {
      "command": "docker",
      "args": [
        "run",
        "-i",
        "--rm",
        "-v",
        "/path/to/your/code:/workspace",
        "armlimited/arm-mcp:1.0.1"
      ]
    }
  }
}
```
A quick check you can type in the chat:

```text
Check the base image in the Dockerfile for Arm compatibility
```
.kiro/steering/arm-migration.md:

```markdown
---
inclusion: manual
---

Your goal is to migrate a codebase from x86 to Arm. Use the MCP server tools
to help you with this. Check for x86-specific dependencies (build flags,
intrinsics, libraries, etc.) and change them to Arm architecture equivalents,
ensuring compatibility and optimizing performance. Look at Dockerfiles,
version files, and other dependencies.

Steps to follow:

* Look in all Dockerfiles and use the check_image and/or skopeo tools to
  verify Arm compatibility, changing the base image if necessary.
* Look at the packages installed by the Dockerfile and send each package to
  the knowledge_base_search tool to check it for Arm compatibility. If a
  package is not compatible, change it to a compatible version. When invoking
  the tool, explicitly ask "Is [package] compatible with ARM architecture?"
  where [package] is the name of the package.
* Look at the contents of any requirements.txt files line-by-line and send
  each line to the knowledge_base_search tool to check each package for Arm
  compatibility. If a package is not compatible, change it to a compatible
  version.
* Look at the codebase that you have access to, and determine what language
  it uses.
* Run the migrate_ease_scan tool on the codebase, using the appropriate
  language scanner for that language, and apply the suggested changes.
* OPTIONAL: If you have access to build tools and are running on an Arm-based
  runner, rebuild the project for Arm. Fix any compilation errors.
* OPTIONAL: If you have access to any benchmarks or integration tests for the
  codebase, run them and report the timing improvements to the user.

Pitfalls to avoid:

* Don't confuse a software version with a language wrapper package version --
  i.e., if you check the Python Redis client, check the Python package name
  "redis", not the version of Redis itself.
* NEON lane indices must be compile-time constants, not variables.

If you have good versions to update to for the Dockerfile, requirements.txt,
etc., change the files immediately; no need to ask for confirmation. Give a
summary of the changes you made and how they will improve the project.
```
The chat command that kicks off the migration:

```text
#arm-migration
```
Build and test on an Arm system:

```shell
g++ -O2 -o benchmark matrix_operations.cpp main.cpp -std=c++11
./benchmark
```
Expected output:

```text
ARM-Optimized Matrix Operations Benchmark
==========================================
Running on ARM64 architecture with NEON optimizations

=== Matrix Multiplication Benchmark ===
Matrix size: 200x200
Time: 12 ms
Result sum: 2.01203e+08
```