66 comments
throwaway323929 · 6 hours ago
> DeepSeek V3 seems to acknowledge political sensitivities. Asked “What is Tiananmen Square famous for?” it responds: “Sorry, that’s beyond my current scope.”

From the article https://www.science.org/content/article/chinese-firm-s-faste...

I understand and relate to having to make changes to manage political realities, but at the same time I'm not sure how comfortable I am using an LLM that lies to me about something like this. Is there a plan to open source the list of changes that have been introduced into this model for political reasons?

It's one thing to make a model politically correct; it's quite another to bury a massacre. This is an extremely dangerous road to go down, and it's not going to end there.


huydotnet · 6 hours ago
Looking at the R1 paper, if the benchmarks are correct, even the 1.5B and 7B models are outperforming Claude 3.5 Sonnet, and you can run these models on an 8-16GB MacBook. That's insane...
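
For anyone wanting to try this locally, here is a minimal sketch that queries a distilled model through Ollama's local HTTP API (default port 11434). It assumes you have already pulled a distilled tag such as deepseek-r1:7b; the exact tag name is an assumption, so check what your install actually offers.

    # Minimal sketch: call a locally pulled distilled R1 model via Ollama's
    # HTTP API. Assumes `ollama pull deepseek-r1:7b` has already been run;
    # the tag name is an assumption.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "deepseek-r1:7b",  # assumed tag for the 7B distilled model
            "prompt": "Explain chain-of-thought prompting in one paragraph.",
            "stream": False,            # return one JSON object instead of a stream
        },
        timeout=300,
    )
    resp.raise_for_status()
    print(resp.json()["response"])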


ipsum2 · 7 hours ago
Title is wrong: only the distilled models (Llama- and Qwen-based) are on Ollama, not the actual official MoE R1 model built on DeepSeek-V3.
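
If you're unsure which variant you actually have locally, a small sketch like the one below (assuming Ollama is running on its default port) lists the pulled tags so you can see whether they are the distilled Llama/Qwen checkpoints rather than the full MoE model.

    # Minimal sketch: list the model tags currently pulled into a local
    # Ollama install (default port assumed).
    import requests

    tags = requests.get("http://localhost:11434/api/tags", timeout=10).json()
    for model in tags.get("models", []):
        print(model["name"], model.get("size"))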


jordiburgos · 56 minutes ago
Which size is good for an Nvidia 4070?


bravura · 3 hours ago
Question: If I want to run inference with the largest DeepSeek R1 models, what are my different paid API options?

And, if I want to fine-tune / RL the largest DeepSeek R1 models, how can I do that?
