Chips and Cheese
Dynamic Register Allocation on AMD's RDNA 4 GPU Architecture
#ChipAndCheese
Telegraph | source
(author: Chester Lam)
Chips and Cheese
Inside Nvidia's GeForce 6000 Series
#ChipAndCheese
Telegraph | source
(author: Chester Lam)
Et Tu, Grammarly? https://dbushell.com/2025/03/29/et-tu-grammarly/
Chips and Cheese
An Interview with Oxide's Bryan Cantrill
#ChipAndCheese
Telegraph | source
(author: George Cozma)
Matt Keeter
The Prospero Challenge
Evaluate a 7866-clause math expression for fame and glory
source
(author: Matt Keeter)
属于CYY自己的世界
Converting the rootfs to btrfs on an Alibaba Cloud ECS instance running Ubuntu 22.04
source
(author: Yangyu Chen)
# SSH into the server
sudo su
cd /boot
wget http://mirrors.cqu.edu.cn/debian/dists/stable/main/installer-amd64/current/images/netboot/debian-installer/amd64/initrd.gz
wget http://mirrors.cqu.edu.cn/debian/dists/stable/main/installer-amd64/current/images/netboot/debian-installer/amd64/linux
sed -i 's/GRUB_TIMEOUT=5/GRUB_DEFAULT=5/g' /etc/default/grub
sed -i 's/GRUB_TIMEOUT_STYLE=hidden/GRUB_TIMEOUT_STYLE=menu/g' /etc/default/grub
update-grub
# Open the VNC console from the Alibaba Cloud web console
reboot
# After the reboot, at the GRUB boot menu, press e, enter: linux linux; initrd initrd.gz, then press F10
# Continue through the installer up to the region settings, the mirror (for servers in mainland China, remember to pick a domestic mirror), and the root user and password
# Stop at "Partition disks", choose "Go Back", then select "Execute a shell"
cat /proc/partitions
# Confirm that the root partition is still /dev/vda3
wget https://mirrors.tnonline.net/btrfs/btrfs-progs/x86_64/btrfs-progs-6.9.2-x86_64-static/btrfs-progs-6.9.2-x86_64-static.tar.gz
gunzip btrfs-progs-6.9.2-x86_64-static.tar.gz
tar -xvf btrfs-progs-6.9.2-x86_64-static.tar
fsck.ext4 /dev/vda3 -f
blkid
# Write down the UUID of /dev/vda3 (very important)
# UUID="a9699f99-5614-4444-be92-d2ef6cfdbaf6"
./btrfs-convert.static /dev/vda3
./btrfstune.static -U a9699f99-5614-4444-be92-d2ef6cfdbaf6 /dev/vda3
reboot -f
After the system boots, run sudo update-grub again.
Chips and Cheese
RDNA 4's "Out-of-Order" Memory Accesses
#ChipAndCheese
Telegraph | source
(author: Chester Lam)
Use Long Options in Scripts https://matklad.github.io/2025/03/21/use-long-options-in-scripts.html
Chips and Cheese
Looking Ahead at Intel’s Xe3 GPU Architecture
#ChipAndCheese
Telegraph | source
(author: Chester Lam)
#PL #Rust #OS #Linux | Starting with 25.10, Ubuntu will use uutils in place of GNU coreutils.
https://www.osnews.com/story/141908/ubuntu-to-replace-classic-coreutils-and-more-with-new-rust-based-alternatives/
Harry Chen’s Blog
Tripped up by CMap again: exploring character mapping tables in fonts and PDF files
Telegraph | source
(author: Shengqi Chen)
Daniel Lemire's blog
Speeding up C++ code with template lambdas
Let us consider a simple C++ function which divides all values in a range of integers:
#include <span>

void divide(std::span<int> i, int d) {
    for (auto& value : i) {
        value /= d;
    }
}
If the divisor d is known at compile time, this function can be much faster. E.g., if d is 2, the compiler might optimize away the division and use a shift and a few cheap instructions instead. The same holds for any compile-time constant: the compiler can often generate better code when it knows the value.
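As a rough illustration of the "shift and a few cheap instructions" idea, here is a sketch of signed division by 2 written without a division. The function name half_by_shift is made up for this example, and the exact instructions a compiler emits will differ; the code assumes C++20, where right-shifting a negative signed integer is an arithmetic shift.

```cpp
#include <cstdint>

// Hypothetical helper: computes x / 2 for a signed 32-bit integer using
// only a shift and an add. Assumes C++20 (arithmetic right shift).
constexpr std::int32_t half_by_shift(std::int32_t x) {
    // bias is 1 when x is negative and 0 otherwise; adding it makes the
    // shift round toward zero, matching the behaviour of operator/.
    const std::int32_t bias =
        static_cast<std::int32_t>(static_cast<std::uint32_t>(x) >> 31);
    return (x + bias) >> 1;
}

static_assert(half_by_shift(7) == 7 / 2);
static_assert(half_by_shift(-7) == -7 / 2);
static_assert(half_by_shift(-8) == -8 / 2);
```

The bias step is why the compiler needs "a few cheap instructions" and not just a single shift: a plain arithmetic shift rounds negative values toward negative infinity, whereas integer division in C++ rounds toward zero.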
In C++, a function template is defined using the template keyword followed by a parameter list enclosed in angle brackets < >. A template parameter is usually a type, but it can also be a value such as an integer; it acts as a placeholder that is replaced with an actual type or value when the template is instantiated.
In C++, you can turn the divisor parameter into a template parameter:
template <int d>
void divide(std::span<int> i) {
    for (auto& value : i) {
        value /= d;
    }
}
The template function is not itself a function, but rather a recipe to generate functions: we provide the integer d and a function is created. This allows the compiler to work with a compile-time constant, producing faster code.
If you expect the divisor to be between 2 and 6, you can call the template function from a general-purpose function like so:
void divide_fast(std::span<int> i, int d) {
    if (d == 2) {
        return divide<2>(i);
    }
    if (d == 3) {
        return divide<3>(i);
    }
    if (d == 4) {
        return divide<4>(i);
    }
    if (d == 5) {
        return divide<5>(i);
    }
    if (d == 6) {
        return divide<6>(i);
    }
    for (auto& value : i) {
        value /= d;
    }
}
You could do it with a switch/case if you prefer, but it does not simplify the code significantly.
Unfortunately, we have to expose a template function, which creates noise in our code base. We would prefer to keep all the logic inside one function. We can do so with lambda functions.
In C++, a lambda function (or lambda expression) is an anonymous, inline function that you can define on the fly, typically for short-term use. Starting with C++20, lambda expressions can take an explicit template parameter list.
We can almost do it like so:
void divide_fast(std::span<int> i, int d) {
    auto f = [&i]<int divisor>() {
        for (auto& value : i) {
            value /= divisor;
        }
    };
    if (d == 2) {
        return f<2>();
    }
    if (d == 3) {
        return f<3>();
    }
    if (d == 4) {
        return f<4>();
    }
    if (d == 5) {
        return f<5>();
    }
    if (d == 6) {
        return f<6>();
    }
    for (auto& value : i) {
        value /= d;
    }
}
Unfortunately, it does not quite work. The template parameter belongs to the lambda's call operator rather than to the object f itself, so you cannot write f<2>(). You have to name the call operator explicitly, which is ugly (‘template operator()<params>’):
void divide_fast(std::span<int> i, int d) {
    auto f = [&i]<int divisor>() {
        for (auto& value : i) {
            value /= divisor;
        }
    };
    if (d == 2) {
        return f.template operator()<2>();
    }
    if (d == 3) {
        return f.template operator()<3>();
    }
    if (d == 4) {
        return f.template operator()<4>();
    }
    if (d == 5) {
        return f.template operator()<5>();
    }
    if (d == 6) {
        return f.template operator()<6>();
    }
    for (auto& value : i) {
        value /= d;
    }
}
In practice, it might still be a good choice. It keeps all the messy optimization hidden inside your function.
source