News read by @EastonMan
Daniel Lemire's blog
Regular expressions can blow up!

Regular expressions, often abbreviated as regex, are a powerful tool for pattern matching within text. For example, the expression
\d*\.?\d+

would match a positive number such as 1.1 or 12. If designed and tested with care, regular expressions may be used in mission-critical software. However, their power comes with a risk: it is possible to design small regular expressions that are very expensive to run on even small strings.

To make matters more complicated, there are several regular-expression engines, and they differ in their syntax and implementation. Let me consider the regular-expression engine used by the C++ language under Linux (libstdc++).

Consider the following program. It uses the string “Everyone loves Lucy.” and a regex pattern (.*+s}}@w. I am not exactly sure what this pattern is supposed to do, but it is accepted by the engine. The program then uses std::regex_search to look for matches of this pattern within the string, storing potential matches in a std::smatch object, and outputs whether a match was found or not.
#include <iostream>
#include <regex>
#include <string>

int main() {
    std::string text = "Everyone loves Lucy.";
    // The pattern is short, but the engine accepts it and searches very slowly
    std::regex pattern(R"(.*+s}}@w)");
    // Perform regex search
    std::smatch match;
    bool found = std::regex_search(text, match, pattern);
    std::cout << "Regex search result: "
              << (found ? "Match found" : "No match") << std::endl;
    return 0;
}

Using GCC 12 and a recent Linux server, this program takes about 7 minutes to run.

In other words, a bad regular expression can crash your systems. This is not just a theoretical concern: Cloudflare suffered a major outage in 2019 due to a bad regular expression.

Use regular expressions with care.

source
Matt Keeter
Guided by the beauty of our test suite

source
(author: Matt Keeter)
Arch Linux: Recent news updates
Critical rsync security release 3.4.0

We'd like to raise awareness about the rsync security release version 3.4.0-1 as described in our advisory ASA-202501-1.

An attacker only requires anonymous read access to a vulnerable rsync server, such as a public mirror, to execute arbitrary code on the machine the server is running on. Additionally, attackers can take control of an affected server and read/write arbitrary files of any connected client. Sensitive data can be extracted, such as OpenPGP and SSH keys, and malicious code can be executed by overwriting files such as ~/.bashrc or ~/.popt.

We highly advise anyone who runs an rsync daemon or client prior to version 3.4.0-1 to upgrade and reboot their systems immediately. As Arch Linux mirrors are mostly synchronized using rsync, we highly advise any mirror administrator to act immediately, even though the hosted package files themselves are cryptographically signed.

All infrastructure servers and mirrors maintained by Arch Linux have already been updated.

source
(author: Robin Candau)
Daniel Lemire's blog
The ivory tower’s drift: how academia’s preference for theory over empiricism fuels scientific stagnation

Almost all of academic science has moved away from actual (empirical) science. It is higher status to work on theories and models. I believe that this shift is closely related to the well-documented scientific stagnation, as theory is often ultimately sterile.

This tendency is quite natural in academia if there is no outside pressure… And is the main reason why academia should be ruthlessly judged by practitioners and users. As soon as academia can isolate itself in a bubble, it is bound to degrade.

It is worth trying to understand some of the factors driving this degradation… Theoretical work can sometimes be seen as more complex. This complexity can be mistakenly equated with higher intelligence or prestige. Empirical work, while also complex, often deals with tangible, observable data, which might seem more straightforward to the uninitiated.

Empirical work is more likely to lead to nuanced or inconclusive results, while theory often seems more direct and definitive. Theoretical research often requires fewer resources than large-scale empirical studies, which might need extensive funding for equipment, data collection, and personnel. Thus, with models and theory, you get to do more research with less.

Theoretical work is often seen as requiring a high level of creativity to devise new frameworks or models. While empirical work also requires creativity in design, execution, and interpretation, the creativity in data collection or experimental design might be less recognized or appreciated.

The educational system often glorifies theoretical knowledge over practical skills until one reaches higher education or specialized training. E.g., we eagerly make calculus compulsory even if it has modest relevance in most practical fields. This educational bias can carry over into professional work.

Society must demand actual results. We must reject work that is said ‘to improve our understanding’ or ‘to lay a foundation for further work’. We must demand cheaper rockets, cures for cancer, software that is efficient. As long as academic researchers are left to their own devices, they will continue to fill the minds of the young with unnecessary models. They must be held accountable.

source
Daniel Lemire's blog
JavaScript hashing speed comparison: MD5 versus SHA-256

source
Matt Keeter
Fidget

Blazing fast implicit surface evaluation

source
(author: Matt Keeter)