News Snapshot:
Chinese tech giant Alibaba is desperately trying to save its cloud business with large language models (LLMs). After releasing Qwen-7B in October, Alibaba recently released Qwen-72B, which has been trained on 3 trillion tokens of high-quality data. Compared with previous versions, it has a larger parameter count, an expanded 32K context window, and more customisation capabilities. Not just that, the company also added a smaller language model, Qwen-1.8B, touting it as a gift to the research community; it has a 2K context length and requires only 3GB of GPU memory. Both of...
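For readers who want to try the smaller model, here is a minimal sketch of loading a Qwen checkpoint for inference with the Hugging Face transformers library in half precision, which is what keeps the GPU memory footprint low. The model id "Qwen/Qwen-1_8B-Chat" and the example prompt are assumptions for illustration; Qwen checkpoints ship custom modelling code, so `trust_remote_code=True` is required.

```python
# Sketch: load Qwen-1.8B in fp16 and generate a short completion.
# The model id below is an assumption; check the Qwen organisation on the Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen-1_8B-Chat"

# Qwen repos include custom code, hence trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to keep memory use low
    device_map="auto",          # place weights on the available GPU
    trust_remote_code=True,
)

prompt = "Explain what a context window is in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In this setup the roughly 1.8 billion fp16 parameters account for most of the memory use, which is consistent with the ~3GB figure the company cites for the small model.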