The herd is a collection of three different models, dubbed Behemoth, Scout, and Maverick. Meta says that Behemoth, which is still in development, will be one of the smartest LLMs in the world when it's done. It uses a total of two trillion neural weights, or parameters, which would be the largest amount disclosed publicly by any researchers. Behemoth was then used to create the two smaller models, Scout and Maverick, via an increasingly popular approach known as distillation, where the h
