How helpful is AI?
Do large language models (AI) make you 3x faster or only 3% faster? The answer depends on the quality of the work you are producing.
If you need something like a stock photo but not much beyond that, AI can make you 10x faster.
If you need a picture taken at the right time of the right person, AI doesn’t help you much.
If you need a piece of software that an intern could have written, AI can do it 10x faster than the intern.
If you need a piece of software that only 10 engineers in the country can understand, AI doesn’t help you much.
The effect is predictable: if your skill level is low, finding work becomes more difficult. However, if you are highly skilled, you can eliminate much of the boilerplate work and focus on what matters. Thus, elite people are going to become even more productive.
source
Chips and Cheese
Dynamic Register Allocation on AMD's RDNA 4 GPU Architecture
#ChipAndCheese
Telegraph | source
(author: Chester Lam)
Chips and Cheese
Inside Nvidia's GeForce 6000 Series
#ChipAndCheese
Telegraph | source
(author: Chester Lam)
Et Tu, Grammarly? https://dbushell.com/2025/03/29/et-tu-grammarly/
Chips and Cheese
An Interview with Oxide's Bryan Cantrill
#ChipAndCheese
Telegraph | source
(author: George Cozma)
Matt Keeter
The Prospero Challenge
Evaluate a 7866-clause math expression for fame and glory
source
(author: Matt Keeter)
属于CYY自己的世界
Converting the rootfs to btrfs on an Alibaba Cloud ECS instance running Ubuntu 22.04
# SSH into the server
sudo su
cd /boot
wget http://mirrors.cqu.edu.cn/debian/dists/stable/main/installer-amd64/current/images/netboot/debian-installer/amd64/initrd.gz
wget http://mirrors.cqu.edu.cn/debian/dists/stable/main/installer-amd64/current/images/netboot/debian-installer/amd64/linux
# Show the GRUB menu for 5 seconds so the boot entry can be edited after the reboot
sed -i 's/^GRUB_TIMEOUT=.*/GRUB_TIMEOUT=5/' /etc/default/grub
sed -i 's/GRUB_TIMEOUT_STYLE=hidden/GRUB_TIMEOUT_STYLE=menu/g' /etc/default/grub
update-grub
# Open the VNC console from the Alibaba Cloud management console
reboot
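# After the reboot, press e at the GRUB boot menu, change the entry to: linux linux; initrd initrd.gz, then press F10 to boot
# Go through the installer up to and including the locale, the mirror (pick a mainland China mirror for servers located in China), and the root user and password
# Stop at the "Partition disks" step, choose "Go Back", then select "Execute a shell"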
cat /proc/partitions
# Confirm that the root partition is still /dev/vda3
wget https://mirrors.tnonline.net/btrfs/btrfs-progs/x86_64/btrfs-progs-6.9.2-x86_64-static/btrfs-progs-6.9.2-x86_64-static.tar.gz
gunzip btrfs-progs-6.9.2-x86_64-static.tar.gz
tar -xvf btrfs-progs-6.9.2-x86_64-static.tar
fsck.ext4 /dev/vda3 -f
blkid
# Write down the UUID of /dev/vda3 (very important)
# UUID="a9699f99-5614-4444-be92-d2ef6cfdbaf6"
./btrfs-convert.static /dev/vda3
./btrfstune.static -U a9699f99-5614-4444-be92-d2ef6cfdbaf6 /dev/vda3
reboot -f
# After the system boots, run sudo update-grub again
source
(author: Yangyu Chen)
Chips and Cheese
RDNA 4's "Out-of-Order" Memory Accesses
#ChipAndCheese
Telegraph | source
(author: Chester Lam)
Use Long Options in Scripts https://matklad.github.io/2025/03/21/use-long-options-in-scripts.html
Chips and Cheese
Looking Ahead at Intel’s Xe3 GPU Architecture
#ChipAndCheese
Telegraph | source
(author: Chester Lam)
#PL #Rust #OS #Linux | Starting with 25.10, Ubuntu will use uutils in place of GNU coreutils.
https://www.osnews.com/story/141908/ubuntu-to-replace-classic-coreutils-and-more-with-new-rust-based-alternatives/
Harry Chen’s Blog
Bitten by CMap again: exploring character mapping tables in fonts and PDF files
Telegraph | source
(author: Shengqi Chen)
Daniel Lemire's blog
Speeding up C++ code with template lambdas
Let us consider a simple C++ function which divides all values in a range of integers:
#include <span>

// Divide every element of the range in place by the runtime value d.
void divide(std::span<int> i, int d) {
    for (auto& value : i) {
        value /= d;
    }
}
If the divisor d is known at compile time, this function can be much faster. For example, if d is 2, the compiler might optimize away the division and use a shift and a few cheap instructions instead. The same holds for any compile-time constant: the compiler can often generate better code when it knows the value.
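To make the "shift and a few cheap instructions" concrete, here is a minimal sketch of what such a rewrite could look like for signed division by 2. The names div2_plain and div2_shift are hypothetical and not from the original post, and the exact instructions a given compiler emits may differ; the sketch only shows that a shift-based sequence can reproduce the result of the division.

```cpp
#include <cstdint>

// Plain signed division by a constant; rounds toward zero.
constexpr std::int32_t div2_plain(std::int32_t x) { return x / 2; }

// Hand-written equivalent (illustrative only): add the sign bit so that
// negative values round toward zero, then shift right arithmetically.
constexpr std::int32_t div2_shift(std::int32_t x) {
    const std::int32_t bias =
        static_cast<std::int32_t>(static_cast<std::uint32_t>(x) >> 31);
    return (x + bias) >> 1;
}

// Compile-time spot checks that both forms agree.
static_assert(div2_shift(7) == div2_plain(7));
static_assert(div2_shift(-7) == div2_plain(-7));
static_assert(div2_shift(-8) == div2_plain(-8));
```

A plain shift alone would round negative values toward minus infinity, which is why the sign bit is added first.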
In C++, a template function is defined using the template keyword followed by one or more parameters (usually type parameters) enclosed in angle brackets < >. A template parameter acts as a placeholder that is replaced with an actual type, or with a constant value as in our case, when the template is instantiated.
You can turn the divisor into a template parameter:
// d is now a compile-time constant, fixed for each instantiation.
template <int d>
void divide(std::span<int> i) {
    for (auto& value : i) {
        value /= d;
    }
}
The template function is not itself a function, but rather a recipe to generate functions: we provide the integer d and a function is created. This allows the compiler to work with a compile-time constant, producing faster code.
If you expect the divisor to be between 2 and 6, you can call the template function from a general-purpose function like so:
void divide_fast(std::span<int> i, int d) {
    if (d == 2) {
        return divide<2>(i);
    }
    if (d == 3) {
        return divide<3>(i);
    }
    if (d == 4) {
        return divide<4>(i);
    }
    if (d == 5) {
        return divide<5>(i);
    }
    if (d == 6) {
        return divide<6>(i);
    }
    // Fallback for any other divisor: plain runtime division.
    for (auto& value : i) {
        value /= d;
    }
}
You could do it with a switch/case if you prefer but it does not simplify the code significantly.
Unfortunately, we have to expose a template function, which creates noise in our code base. We would prefer to keep all the logic inside one function. We can do so with lambda functions.
In C++, a lambda function (or lambda expression) is an anonymous, inline function that you can define on the fly, typically for short-term use. Starting with C++20, lambda expressions can take explicit template parameters.
We can almost do it like so:
// Does not compile: f<2>() is not valid syntax for calling a template lambda.
void divide_fast(std::span<int> i, int d) {
    auto f = [&i]<int divisor>() {
        for (auto& value : i) {
            value /= divisor;
        }
    };
    if (d == 2) {
        return f<2>();
    }
    if (d == 3) {
        return f<3>();
    }
    if (d == 4) {
        return f<4>();
    }
    if (d == 5) {
        return f<5>();
    }
    if (d == 6) {
        return f<6>();
    }
    for (auto& value : i) {
        value /= d;
    }
}
Unfortunately, it does not quite work. The template parameters of a template lambda belong to its call operator, so you cannot write f<2>() directly. When a parameter cannot be deduced from the arguments, you need something ugly (‘template operator()<params>’):
void divide_fast(std::span<int> i, int d) {
    auto f = [&i]<int divisor>() {
        for (auto& value : i) {
            value /= divisor;
        }
    };
    if (d == 2) {
        return f.template operator()<2>();
    }
    if (d == 3) {
        return f.template operator()<3>();
    }
    if (d == 4) {
        return f.template operator()<4>();
    }
    if (d == 5) {
        return f.template operator()<5>();
    }
    if (d == 6) {
        return f.template operator()<6>();
    }
    // Fallback for any other divisor: plain runtime division.
    for (auto& value : i) {
        value /= d;
    }
}
In practice, it might still be a good choice. It keeps all the messy optimization hidden inside your function.
source